Stealth • 2m
Not fine-tuning or pre-training. It's about changing the preprocessing step. What we do now is chunk the data, create embeddings, store them in a vector DB, query it, and feed the retrieved results to the LLM for a response. This is error-prone with complex documents that contain tables, graph elements, etc., because naive chunking misses context by breaking each document's text apart. Replacing this step with a better approach is what improves retrieval accuracy.
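For context, here's a rough sketch of that current pipeline in Python. It assumes sentence-transformers for the embeddings and uses a plain numpy array as a stand-in "vector DB"; the model name, chunk size, and helper names are just illustrative, not a specific product's API.

```python
# Minimal sketch of the usual chunk -> embed -> store -> retrieve flow.
# Assumptions: sentence-transformers for embeddings, in-memory numpy index.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking: this is the step that breaks tables,
    # graphs, and other structured elements apart.
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_index(docs: list[str]) -> tuple[list[str], np.ndarray]:
    chunks = [c for d in docs for c in chunk(d)]
    embeddings = model.encode(chunks, normalize_embeddings=True)
    return chunks, np.asarray(embeddings)

def retrieve(query: str, chunks: list[str], embeddings: np.ndarray, k: int = 3) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = embeddings @ q  # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

if __name__ == "__main__":
    docs = ["... your document text here ..."]
    chunks, embeddings = build_index(docs)
    context = "\n".join(retrieve("What does the table show?", chunks, embeddings))
    prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
    # `prompt` would then be sent to whatever LLM you use for the response.
```

The weak point is the naive `chunk()` function above: a table row or graph caption split across two chunks loses its meaning, which is exactly why changing this preprocessing step matters more than fine-tuning.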