Advent of (MemGraph)RAG (1) - Do you need a framework?
You want to do GraphRAG. Do you need a framework?
We did about 100 calls this year with teams who want to build GraphRAG. The pattern is always the same: everyone wants to do it, but most don’t know where to start.
The biggest blocker is the feeling that you need to “get it right” on the first try. People get stuck deciding if they should build it all from scratch or pick a framework that promises to do it all.
Here is the reality: No framework is going to save you from your own business logic.
In my experience, everyone ends up writing custom pipelines anyway. Your data is weird, your rules are specific, and a generic framework rarely fits perfectly out of the box.
To actually move forward, you only need to figure out three things:
Ingestion: How do you turn your files into a graph?
Retrieval: How do you find the right data later?
The Tech: What database are you going to store your structured and unstructured data in, and can that engine actually back your retrieval methods?
You need to get these three things running end-to-end as fast as possible. It doesn’t have to be perfect. You just need to get it in front of users or your team to see if the answers are actually good.
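In code, that first pass can be as thin as the skeleton below. Every name in it is a placeholder I made up to show the shape of the loop, not a real API:

```python
# A bare-bones skeleton of the three pieces wired together. All names here are
# placeholders for your own logic, not a real library.

from typing import List, Tuple

def ingest(paths: List[str]) -> Tuple[list, list]:
    """Ingestion: parse files and extract nodes/edges (chunking, LLM extraction, rules...)."""
    nodes, edges = [], []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            text = f.read()
        # ...split `text` into chunks and pull out (entity)-[relation]->(entity) triples...
    return nodes, edges

def store(nodes: list, edges: list) -> None:
    """The Tech: write nodes, edges, and raw chunks into whatever database you picked."""
    # ...MERGE statements, embeddings, vector/text indexes...

def retrieve(question: str) -> List[str]:
    """Retrieval: find the right data later (vector search, text search, graph hops)."""
    # ...embed the question, query the indexes, expand over the graph...
    return []

if __name__ == "__main__":
    nodes, edges = ingest(["docs/replication.md"])
    store(nodes, edges)
    print(retrieve("How do I configure replication?"))
```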
Why I think you should prototype first
I honestly think hacking together a prototype yourself is better than marrying a framework on day one.
It’s faster: You can stitch code together to fit your exact problem.
It’s all R&D: GraphRAG is still new. It’s high risk, high reward. You need to be able to change things quickly.
You can switch later: Once you know what works for you, it’s easy to move that logic into a framework later.
Frameworks are new and changing: These frameworks are all very new and constantly changing and improving. Do you really want to lock yourself into something that dynamic? Some of them don’t even know what the heck they’re doing yet, but all of them will pitch you to use them, because your usage is how they improve. If you’re a big enterprise, that kind of churn may be overkill on your side. Stick to your own pace!
How we are doing it at Memgraph
We decided to build a toolkit to help with this. We didn’t want to reinvent the wheel, so for unstructured data we built Unstructured2Graph using libraries that already work well:
unstructured.io for handling the messy data.
LightRAG to build the knowledge graph.
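To give you an idea of the shape, here is a minimal sketch of how those two libraries fit together. This is not the Unstructured2Graph source, it assumes an OpenAI key in your environment, and LightRAG’s import paths and constructor arguments have moved around between versions, so check the README for the release you install:

```python
# A sketch: unstructured.io parses the file, LightRAG builds the knowledge graph.
# Import paths and constructor args vary across LightRAG versions.

from unstructured.partition.auto import partition  # pip install "unstructured[all-docs]"
from lightrag import LightRAG, QueryParam          # pip install lightrag-hku
from lightrag.llm import gpt_4o_mini_complete      # module path differs in newer releases

# 1. unstructured.io deals with the messy file formats.
elements = partition(filename="docs/replication.pdf")
text = "\n\n".join(el.text for el in elements if el.text)

# 2. LightRAG extracts entities and relations and builds the knowledge graph.
rag = LightRAG(working_dir="./rag_storage", llm_model_func=gpt_4o_mini_complete)
rag.insert(text)

# 3. Sanity-check the answers before you worry about architecture.
print(rag.query("How does replication work?", param=QueryParam(mode="hybrid")))
```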
On the flip side, we also have sql2graph. People often forget that “talking to your data” isn’t just about reading PDFs. You usually need your structured SQL data mixed in there too. Frankly, if the most important data lived in PDFs, it would have been converted to structured data a long time ago. Unstructured data is still valuable, though, because it’s expressive, and until LLMs came along there was no meaningful way to get productivity out of it.
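If you were hacking the SQL side yourself, the core loop is just rows in, MERGE statements out. The schema (orders, customers) and the graph labels below are invented for the example; since Memgraph speaks Bolt, the standard neo4j Python driver can do the talking:

```python
# A hand-rolled version of the sql2graph idea: pull rows out of a relational
# database and MERGE them into the graph as nodes and relationships.

import sqlite3
from neo4j import GraphDatabase  # pip install neo4j

sql = sqlite3.connect("shop.db")
rows = sql.execute(
    "SELECT o.id, o.total, c.id, c.name "
    "FROM orders o JOIN customers c ON o.customer_id = c.id"
).fetchall()

driver = GraphDatabase.driver("bolt://localhost:7687")
with driver.session() as session:
    for order_id, total, customer_id, name in rows:
        session.run(
            """
            MERGE (c:Customer {id: $cid}) SET c.name = $name
            MERGE (o:Order {id: $oid}) SET o.total = $total
            MERGE (c)-[:PLACED]->(o)
            """,
            cid=customer_id, name=name, oid=order_id, total=total,
        )
driver.close()
```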
The Setup
For retrieval, we see people using MCP servers to chat with their documentation.
Then, you add the “sugar on top”: Memgraph. You get vector search and text search ready for production, plus the knowledge graph to find related connections. This helps you find things that simple vector search misses.
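Here is a rough sketch of what that retrieval step can look like: vector search to find entry-point chunks, then one graph hop to pull in related context. The index name, labels, and relationship types are mine, and the exact vector search procedure signature depends on your Memgraph version, so check the docs before copying this:

```python
# Sketch: vector search for entry points, then expand over the knowledge graph.
# Assumes a vector index ("chunk_embeddings") on :Chunk(embedding) already exists.

from neo4j import GraphDatabase  # Memgraph speaks Bolt

def embed(text: str) -> list[float]:
    """Placeholder: call your embedding model here (OpenAI, sentence-transformers, ...)."""
    raise NotImplementedError

driver = GraphDatabase.driver("bolt://localhost:7687")
with driver.session() as session:
    records = session.run(
        """
        CALL vector_search.search('chunk_embeddings', 5, $q) YIELD node, similarity
        MATCH (node)-[:MENTIONS]->(e:Entity)<-[:MENTIONS]-(related:Chunk)
        RETURN node.text AS hit, similarity,
               collect(DISTINCT related.text)[0..3] AS neighbours
        """,
        q=embed("How do I configure replication?"),
    )
    for r in records:
        print(round(r["similarity"], 3), r["hit"][:80], "|", len(r["neighbours"]), "related chunks")
driver.close()
```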
Putting it all together
This combination of unstructured2graph, sql2graph, and our MCP integration forms the base of Memgraph’s AI Toolkit. The goal is to provide solid ingestion and retrieval mechanisms that work out of the box, but without locking you into a black box.
Conclusion: Start Now
So, where do you start? Prototype fast. Don’t worry about the perfect architecture yet. Just figure out where your documentation lives (is it a list of URLs? A folder of PDFs?) and get it ingested.
For retrieval, don’t rely on just one method. The secret sauce is combining vector search, text search, and relevance expansion (I’ll dive deep into exactly how to do this in the next chapters).
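For a taste of what “combining” can mean, here is reciprocal rank fusion, a generic way to merge ranked lists coming from different retrievers. I’m using it purely as an illustration, not as a preview of the exact recipe in the next chapters:

```python
# Reciprocal rank fusion: merge ranked lists from different retrievers
# (vector, text, graph expansion), rewarding items that appear near the top
# of more than one ranking.

from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Chunk ids returned by three (hypothetical) retrieval methods:
vector_hits = ["c12", "c7", "c3"]
text_hits = ["c7", "c21", "c12"]
graph_hits = ["c3", "c7", "c40"]

print(reciprocal_rank_fusion([vector_hits, text_hits, graph_hits]))
# "c7" comes out on top: it sits near the top of all three lists.
```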
Once you have a solid foundation for your documentation, the leverage is huge. You can suddenly manage 10x more customers without hiring more support staff. In our case at Memgraph, this shift allows us to stop just “answering questions about the docs” and start helping users actually solve their complex use cases.
