The LLM stack
A good discussion on the emerging LLM app stack (in a way, the AI stack, since everything AI seems to be LLMs nowadays): https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/ VCs have already made their bets in most layers of the stack. Let us look at the stack in detail and why we need each layer.
Fig: from a16z blog
Data pipelines and APIs/plugins have existed for a while. But to give meaning to the data, we need embeddings. For embedding models, OpenAI and Hugging Face are the leaders, with Cohere coming up now.
Personally, I have not seen a huge difference in embedding quality for general sentences. All of them share the same weaknesses (negated sentences, for example). Some produce larger vectors. But embeddings have effectively become a commodity, as the cost of generating one has become very, very cheap.
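The negation problem is easy to check yourself: embed a sentence and its negation and compare cosine similarity; with most general-purpose models the score comes out surprisingly high. A minimal sketch of the comparison (the vectors below are illustrative placeholders for what an embedding API would return, not real model output):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder vectors standing in for embed("I like cats") and
# embed("I do not like cats"); real models often put these close together.
v_pos = np.array([0.82, 0.41, 0.11, 0.37])
v_neg = np.array([0.80, 0.45, 0.09, 0.35])

print(round(cosine_similarity(v_pos, v_neg), 3))
```

With a real model, a similarity this close to 1.0 between a sentence and its negation is exactly the failure mode mentioned above: nearest-neighbor retrieval will happily return the opposite of what you asked for.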
Because of the size and importance of embeddings, vector DBs are needed to manage them. In the recent past this layer of the stack has seen the most VC activity: Pinecone, Weaviate, Chroma, Vespa, Milvus, pgvector, OpenSearch. Everyone is a vector DB now. I guess because this layer of the stack is more software engineering than actual ML (these DBs are providing wrappers around the underlying search algorithms), more people are competing in this segment. There is no clear winner. Purely from a feature perspective I like Vespa and Milvus: Vespa for its attention to detail and Milvus because it supports bit embeddings.
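Bit embeddings are worth a quick illustration of what the DB is doing under the hood: each vector is a string of bits, and similarity search reduces to Hamming distance, i.e. XOR plus popcount. A brute-force sketch with random data standing in for real binary embeddings (a vector DB's index is essentially an accelerated version of this loop):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: 1000 binary embeddings of 256 bits each, packed 8 bits per byte.
corpus = rng.integers(0, 256, size=(1000, 32), dtype=np.uint8)
query = rng.integers(0, 256, size=32, dtype=np.uint8)

# Hamming distance = number of differing bits: XOR, then count set bits.
xor = np.bitwise_xor(corpus, query)
distances = np.unpackbits(xor, axis=1).sum(axis=1)

# Indices of the 5 nearest neighbors by Hamming distance.
top5 = np.argsort(distances)[:5]
print(top5, distances[top5])
```

The appeal is that 256 bits occupy 32 bytes versus ~6 KB for 1536 float32 dimensions, and XOR/popcount is far cheaper than dot products, which is why bit-embedding support matters at scale.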
Since the stack has many layers now, we need orchestration tools. For orchestration, LangChain and LlamaIndex are good bets, though with the release of function calling, OpenAI itself provides some of the orchestration. (Personally, I am in the DIY camp.) Just like all software, we need cloud providers, and here the big 3 are there as usual.
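For the DIY camp, the core of orchestration is small: parse the model's function-call response and dispatch it to local code. A minimal sketch, with the model's response hardcoded here rather than fetched from an API, and a hypothetical `get_weather` tool standing in for your real functions:

```python
import json

# Local "tools" the LLM is allowed to call. In a real app the model
# picks one of these via OpenAI-style function calling.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub; a real tool would call a weather API

TOOLS = {"get_weather": get_weather}

def dispatch(function_call: dict) -> str:
    """Route a model-emitted function call to the matching local function."""
    fn = TOOLS[function_call["name"]]
    args = json.loads(function_call["arguments"])
    return fn(**args)

# Hardcoded stand-in for what the model API would return.
model_response = {"name": "get_weather", "arguments": '{"city": "Berlin"}'}
print(dispatch(model_response))  # → Sunny in Berlin
```

Frameworks like LangChain wrap this loop with retries, memory, and tool schemas, but the dispatch itself is a dictionary lookup plus a JSON parse.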
For the actual AI API there is only one winner as of now: OpenAI. Google has released the Vertex AI APIs and Anthropic is also in the game, but OpenAI is just too far ahead in API access right now.