Let’s write some code!

We’ll create a simple QA chat bot that indexes a PDF and a YouTube video as its knowledge base, so its answers can be grounded in those sources rather than in the model’s general knowledge alone.


Code Explanation:

  1. Imports: The code starts by importing necessary modules for configuration, vector storage, embeddings, document loading, text splitting, and the OpenAI SDK.
  2. Setting Up Variables:
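// The question comes from the command line, falling back to 'hi'
const question = process.argv[2] || 'hi'
// The YouTube video whose transcript will be indexed
const video = `https://youtu.be/zR_iuq2evXo?si=cG8rODgRgXOx9_Cn`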
  3. Create Memory Vector Store:
// Embed every document with OpenAI embeddings and keep the vectors in memory
export const createStore = (docs) =>
  MemoryVectorStore.fromDocuments(docs, new OpenAIEmbeddings())
  4. Load and Split YouTube Video Text:
export const docsFromYTVideo = async (video) => {
  const loader = YoutubeLoader.createFromUrl(video, {
    language: 'en',
    addVideoInfo: true,
  })
  return loader.loadAndSplit(
    new CharacterTextSplitter({
      separator: ' ',
      chunkSize: 2500,
      chunkOverlap: 100,
    })
  )
}

  5. Load and Split PDF Text:
export const docsFromPDF = () => {
  const loader = new PDFLoader('xbox.pdf')
  return loader.loadAndSplit(
    new CharacterTextSplitter({
      separator: '. ',
      chunkSize: 2500,
      chunkOverlap: 200,
    })
  )
}
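
Both loaders above pass `separator`, `chunkSize`, and `chunkOverlap` to the splitter. To make those three options concrete, here is a simplified sketch of separator-based chunking with overlap. It is not LangChain's actual implementation: `splitText` is a hypothetical helper written for illustration, and the tiny sizes are chosen only so the result is easy to read.

```javascript
// Simplified sketch of separator-based chunking with overlap.
// Hypothetical helper, not the CharacterTextSplitter source.
const splitText = (text, { separator, chunkSize, chunkOverlap }) => {
  const pieces = text.split(separator).filter((p) => p.length > 0)
  const length = (parts) => parts.join(separator).length
  const chunks = []
  let current = []

  for (const piece of pieces) {
    // Flush the current chunk if adding this piece would exceed chunkSize
    if (current.length > 0 && length([...current, piece]) > chunkSize) {
      chunks.push(current.join(separator))
      // Carry over trailing pieces (at most chunkOverlap characters)
      // so neighbouring chunks share some context
      while (current.length > 0 && length(current) > chunkOverlap) {
        current.shift()
      }
    }
    current.push(piece)
  }
  if (current.length > 0) chunks.push(current.join(separator))
  return chunks
}

const chunks = splitText('one two three four five six', {
  separator: ' ',
  chunkSize: 13,
  chunkOverlap: 5,
})
console.log(chunks)
// [ 'one two three', 'three four', 'four five six' ]
```

Notice how `three` and `four` each appear in two chunks. That shared text is the overlap, and it reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.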
  6. Load Combined Document Store:
const loadStore = async () => {
  // Load and chunk both sources
  const videoDocs = await docsFromYTVideo(video)
  const pdfDocs = await docsFromPDF()

  // Index the video and PDF chunks together in a single store
  return createStore([...videoDocs, ...pdfDocs])
}
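
Once the combined store exists, answering a question comes down to finding the chunks most similar to it. The sketch below imitates that idea without any network calls. Everything in it is an assumption made for illustration: `embed` is a toy word-count stand-in for real OpenAI embeddings, `createToyStore` is a hypothetical replacement for `MemoryVectorStore`, and the three documents are made-up sample text. LangChain's own `similaritySearch` is asynchronous, while this toy version is synchronous to keep it short.

```javascript
// Toy stand-in for an embedding model: a bag-of-words count vector
const embed = (text) => {
  const counts = new Map()
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    counts.set(word, (counts.get(word) ?? 0) + 1)
  }
  return counts
}

// Cosine similarity between two sparse count vectors
const cosine = (a, b) => {
  let dot = 0
  for (const [word, n] of a) dot += n * (b.get(word) ?? 0)
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, n) => s + n * n, 0))
  const denom = norm(a) * norm(b)
  return denom === 0 ? 0 : dot / denom
}

// Hypothetical in-memory store: embeds docs once, ranks them per query
const createToyStore = (docs) => {
  const entries = docs.map((doc) => ({ doc, vector: embed(doc.pageContent) }))
  return {
    similaritySearch: (query, k = 2) => {
      const q = embed(query)
      return entries
        .map(({ doc, vector }) => ({ doc, score: cosine(q, vector) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, k)
        .map(({ doc }) => doc)
    },
  }
}

// Made-up sample documents, standing in for the video and PDF chunks
const store = createToyStore([
  { pageContent: 'The console ships with one wireless controller.' },
  { pageContent: 'The video reviews the newest racing game.' },
  { pageContent: 'Warranty covers the console for one year.' },
])

const results = store.similaritySearch('how long is the console warranty', 1)
console.log(results[0].pageContent)
// Warranty covers the console for one year.
```

In the real bot, the chunks returned by the search would be passed to the chat model as context alongside the question.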