# Literature Review


Our paper builds on Neural Radiance Fields (NeRF), Infinigen, and 3D-GPT.

[22] NeRF factorisation addresses the challenge of recovering the shape and reflectance of an object from multi-view images. NeRFactor converts a NeRF's volumetric geometry into a surface representation and then refines it without supervision.

[19] The 3D-GPT framework introduces a novel approach to 3D modelling by leveraging Large Language Models (LLMs) such as GPT-4. It decomposes complex 3D modelling tasks into subtasks and assigns a specialised LLM agent to each. This guides our development of a multi-agent, modular pipeline for multimodal inputs.

[17] Our rendering component draws on Infinigen, a framework in which photorealistic 3D scenes and assets are procedurally generated from randomised mathematical rules. We utilise this framework to render intricate visual assets and to generate synthetic datasets.

Nevertheless, existing research handles only text and image inputs when generating 3D scenes, and the physics of those scenes is not yet customisable. We will therefore implement a fully multimodal framework that accepts a wider range of inputs, spanning text, images, audio, and video, and is pre-trained on existing datasets. Our framework is two-pronged: it uses a novel technique, multi-modal radiance fields (MMRF), to render 3D scenes, followed by a customised physics engine that enables prompt-based physics. We developed the SIS metric to gauge the efficacy of our model (see Appendix B).
