OneNote
Prerequisites
- Register an application by following the Microsoft identity platform instructions.
- When registration finishes, the Azure portal displays the app registration’s Overview pane, which shows the Application (client) ID. Also called the `client ID`, this value uniquely identifies your application in the Microsoft identity platform.
- While following the steps in the first item, you can set the redirect URI to `http://localhost:8000/callback`.
- While following the steps in the first item, generate a new password (`client_secret`) under the Application Secrets section.
- Follow the instructions at this document to add the following `SCOPES` (`Notes.Read`) to your application.
- Install the `msal` and `bs4` packages with the commands `pip install msal` and `pip install beautifulsoup4`.
- At the end of these steps you must have the following values:
  - `CLIENT_ID`
  - `CLIENT_SECRET`
🧑 Instructions for ingesting your documents from OneNote
🔑 Authentication
By default, the `OneNoteLoader` expects the values of `CLIENT_ID` and `CLIENT_SECRET` to be stored as environment variables named `MS_GRAPH_CLIENT_ID` and `MS_GRAPH_CLIENT_SECRET` respectively. You can pass those environment variables through a `.env` file at the root of your application, or set them from within your script.
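Setting the two variables from a script can be sketched as follows. The string values are placeholders, not real credentials; replace them with the values from your own app registration.

```python
import os

# Placeholder values (assumptions): substitute the Application (client) ID
# and the client secret generated during app registration.
os.environ["MS_GRAPH_CLIENT_ID"] = "YOUR CLIENT ID"
os.environ["MS_GRAPH_CLIENT_SECRET"] = "YOUR CLIENT SECRET"
```

Hard-coding secrets is only reasonable for quick local experiments; a `.env` file kept out of version control is usually the safer choice.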
Once authentication has completed, the loader stores a token (`onenote_graph_token.txt`) in the `~/.credentials/` folder. This token can be used later to authenticate without repeating the copy/paste steps. To use this token for authentication, set the `auth_with_token` parameter to `True` when instantiating the loader.
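A minimal sketch of reusing the stored token is shown below. It assumes the loader can be imported from `langchain_community.document_loaders.onenote`; `TOKEN_PATH` and `make_loader` are hypothetical names introduced here for illustration, not part of the library.

```python
from pathlib import Path

# Token location as described above.
TOKEN_PATH = Path.home() / ".credentials" / "onenote_graph_token.txt"


def make_loader(**filters):
    """Build a OneNoteLoader, reusing the stored token when one exists.

    Hypothetical helper: `filters` are forwarded to the loader unchanged
    (for example notebook_name, section_name, page_title).
    """
    # Imported lazily: this needs langchain_community installed and the
    # MS_GRAPH_CLIENT_ID / MS_GRAPH_CLIENT_SECRET variables set.
    # The import path is an assumption about your installed version.
    from langchain_community.document_loaders.onenote import OneNoteLoader

    # Only ask for token-based authentication if a token file is present.
    return OneNoteLoader(auth_with_token=TOKEN_PATH.exists(), **filters)
```

Checking for the token file first means the same code path works on the first run, when no token has been stored yet, and on later runs.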
🗂️ Documents loader
📑 Loading pages from a OneNote Notebook
`OneNoteLoader` can load pages from OneNote notebooks stored in OneDrive. You can specify any combination of `notebook_name`, `section_name`, and `page_title` to filter for pages under a specific notebook, under a specific section, or with a specific title respectively. For instance, setting `section_name` to `Recipes` loads all pages stored under a section called Recipes within any of your notebooks in OneDrive.
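The Recipes case might look like the sketch below. The import path is assumed to be `langchain_community.document_loaders.onenote`; `RECIPES_FILTER` and `load_recipe_pages` are hypothetical names used only for this example.

```python
# Filter values; notebook_name, section_name and page_title may be combined.
RECIPES_FILTER = {"section_name": "Recipes"}


def load_recipe_pages(auth_with_token: bool = True):
    """Load every page stored under a section called 'Recipes'.

    Requires valid credentials, so it is not called at import time.
    """
    # Assumed import path; adjust it to match your installed LangChain version.
    from langchain_community.document_loaders.onenote import OneNoteLoader

    loader = OneNoteLoader(auth_with_token=auth_with_token, **RECIPES_FILTER)
    return loader.load()
```

Adding `notebook_name` or `page_title` to the filter dictionary narrows the result further.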
📑 Loading pages from a list of Page IDs
Another possibility is to provide a list of `object_ids`, one for each page you want to load. For that, you need to query the Microsoft Graph API to find the IDs of all the documents you are interested in. The Microsoft Graph documentation lists endpoints that are helpful for retrieving document IDs.
For instance, to retrieve information about all pages stored in your notebooks, make a request to `https://graph.microsoft.com/v1.0/me/onenote/pages`. Once you have the list of IDs you are interested in, you can instantiate the loader by passing them as the `object_ids` parameter.
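As a rough sketch of that workflow: Microsoft Graph returns collections under a top-level `value` key, so the page IDs can be pulled out of the decoded JSON response and handed to the loader. `extract_page_ids` and `load_pages_by_id` are hypothetical helpers, and the import path is an assumption about the installed LangChain version.

```python
from typing import Any

PAGES_ENDPOINT = "https://graph.microsoft.com/v1.0/me/onenote/pages"


def extract_page_ids(payload: dict[str, Any]) -> list[str]:
    """Collect the `id` of every page in a decoded Graph collection response."""
    # Graph wraps list results in a "value" array; skip entries without an id.
    return [page["id"] for page in payload.get("value", []) if "id" in page]


def load_pages_by_id(object_ids: list[str]):
    """Load only the pages whose IDs are given (requires credentials)."""
    # Assumed import path; imported lazily so the helper above works alone.
    from langchain_community.document_loaders.onenote import OneNoteLoader

    loader = OneNoteLoader(object_ids=object_ids, auth_with_token=True)
    return loader.load()
```

In practice, `payload` would be the JSON body returned by an authenticated GET request to `PAGES_ENDPOINT`.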