Notion Answers


Question

1 vote

I want a count of unique word occurrences. How is this done?

I have a table of documents (pages) with a text field. I want to collect that text as unique terms into one or two fields, either in that database or in another database. Ideally, each term would come with a count of its occurrences and a note of where it occurs.

How is this done? Does anyone know?

0 votes

polle commented

Please expand on the information a bit, as it is not clear.

You have a database.
That database has 10 items (pages).
In the same database there is a property (column) that is text.
Or is that text somewhere else? Where?
In that property you add different words.
You want to count how many times you added that word in that column.
Or are those tags?
What and where are the so-called "one or two fields"?

Please try to explain, so that we can understand the structure and see which solution is best.

For other questions please open new questions.

0 votes

adam-notionuser commented

I have a tentative answer.
First of all, to the questions above, from polle.
"In the same database there is a property (column) that is text." Yes.
That property is populated with text.
Actually, the text represents the subjects covered or key terms in the main page data.
Remember that each row in the database is a page with properties that appear in the database view, plus (optional) page text.

"You want to count how many times you added that word in that column."
Yes, I want to count how many times I have added a unique word or, preferably, a combination of two or three words that represent one of the subjects (terms) I just mentioned.

"What and where are the so-called "one or two fields"?"
These are the results of the calculation: the unique terms in one field and a count of them in another.

Now to my answer, which I don't find satisfactory.
I would have thought there is a way of creating a property (in a view of another database) that simply aggregates all the terms from the columns in question, recognising the separators between those terms, and then gives a count of those terms in another field.

I cannot understand how to do this.

What I have done is add two more columns.

One is called "Text" and the other is called "Formula". The names are irrelevant; they are just there to help us be clear.

Text contains this formula, because I wanted to combine two fields into one:-

join("", prop("AI key info"), prop("AI Custom Autofill"))

Formula contains this formula, which includes a regex:-

length(replaceAll(prop("Text"), "([A-Za-z])\w*", "-"))

So this is half a solution.
It doesn't recognise terms made up of two or three words. I could mess around further, but it seems unsatisfactory.

It also doesn't accumulate all of the text into one final field, which was part of the original problem.
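To see why the regex formula is only half a solution, here is a rough Python sketch of what it computes. This assumes Python's `re` treats the pattern the same way Notion's `replaceAll` does, which I have not verified; the sample text and function names are made up for illustration. Each word becomes a single "-", but the commas and spaces stay in the string, so the length is neither a word count nor a term count.

```python
import re

# Same pattern as in the Notion formula above.
WORD = r"([A-Za-z])\w*"


def formula_like_length(text: str) -> int:
    """Mimic length(replaceAll(text, WORD, "-")): separators are still counted."""
    return len(re.sub(WORD, "-", text))


def count_words(text: str) -> int:
    """Count individual words matched by the same pattern."""
    return len(re.findall(WORD, text))


def count_comma_terms(text: str) -> int:
    """Count comma-separated terms, so multi-word phrases stay together."""
    return len([t for t in text.split(",") if t.strip()])


sample = "LaMini, LLM, text summarization"  # made-up sample text
print(formula_like_length(sample), count_words(sample), count_comma_terms(sample))
```

Splitting on the comma rather than matching words is what keeps "text summarization" as one term.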

0 votes

polle commented

I don't understand what you are trying to do or the terms you are using.

Database: A Notion database.
Property: A column in that main database. Not in the pages.
Pages: A Notion item in the database.
Page content: The content you add inside a page, in the body of the page.

If you want to count how many times you write a word inside the "Page content", then it is not possible.

If you want to add text in a "Property", then maybe it is better to use tags and count that.

Then you add formulas to join, count length and replace, which have nothing to do with the main question.

Please try to explain the scenario of what you are trying to achieve using these terms, so that we can understand where and how you are adding the content.

0 votes

adam-notionuser commented

I have a tentative answer.
First of all, to the questions above, from polle.
"In the same database there is a property (column) that is text." Yes. As elsewhere, this refers to a Notion database.
That property is populated with text. (Property: A column in that main database. Not in the pages.)
Actually, the text represents the subjects covered or key terms in the main page data.
Remember that each row in the database is a page (Pages: A Notion item in the database) with properties that appear in the database view, plus (optional) page text (Page content: The content you add inside a page, in the body of the page).

"You want to count how many times you added that word in that column."
Yes, I want to count how many times I have added a unique word or, preferably, a combination of two or three words that represent one of the subjects (terms) I just mentioned.

"If you want to count how many times you write a word inside the "Page content", then it is not possible."
I can see that.

"If you want to add text in a "Property", then maybe it is better to use tags and count that."
I use a succession of properties.
I add text using AI that is related to the page content.

I could use tags, but that is a manual process and relies on knowing what the tags should be and where they should be applied.

"Then you add formulas to join, count length and replace, which have nothing to do with the main question."

Oh, but something I didn't do is look at the "Calculate" field at the bottom of any database.
I didn't think to check how to use formulas there. That could help me at least a bit if all the formulas including regex can be used there.

1 Answer

0 votes

adam-notionuser Points 170

Well, of course, that doesn't work, as Calculate is very restricted.

I have set up something with a Python API package, notion-client.
This one seems good enough:-

Then I can get a JSON result back for the whole database, though perhaps I can control how much comes back.
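One caveat on "the whole database": the Notion API returns query results in batches (at most 100 pages per call) and signals more with `has_more` and `next_cursor`. A minimal sketch of following the cursor is below. The `query` argument is a hypothetical callable standing in for the real request; with notion-client it would probably wrap `databases.query`, but that depends on the package version, so treat the commented usage as an assumption.

```python
from typing import Any, Callable, Optional


def collect_all_pages(query: Callable[..., dict]) -> list:
    """Call `query` repeatedly, following next_cursor, and return every page."""
    pages: list = []
    cursor: Optional[str] = None
    while True:
        # Only pass start_cursor once the API has handed us one.
        kwargs: dict[str, Any] = {"start_cursor": cursor} if cursor else {}
        response = query(**kwargs)
        pages.extend(response["results"])
        if not response.get("has_more"):
            return pages
        cursor = response["next_cursor"]


# Assumed usage with notion-client (token and database id are placeholders):
#   from notion_client import Client
#   notion = Client(auth=NOTION_TOKEN)
#   pages = collect_all_pages(
#       lambda **kw: notion.databases.query(database_id=DATABASE_ID, **kw)
#   )
```

Passing the request in as a callable keeps the loop independent of the client, so it can be tried out with canned responses first.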

I can pull a value out of the result like this:-

from pprint import pprint
pprint(my_page["results"][0]["properties"]["Keyword Autofill"]["rich_text"][0]["text"]["content"])

Which gives me more or less what I want. In my case, it is this, where each comma-separated group is a phrase I'm interested in:-

('Keywords: LaMini, LLM, LangChain, transformers, pipelines, summarization, '
 'Google Colab, AI, models, code snippets, python, diagrams, installation, '
 'text summarization, Hugging Face pipelines, text splitter, CPU, document '
 'interaction, distilling knowledge, fine-tuning, large language models, NLP '
 'benchmarks, encoder-decoder, decoder-only families, Devansh, machine '
 'learning, Bigger models, data processing, curation, solid dataset, Markdown, '
 'PDF, video transcript, Youtube subtitles, interactive documents, text '
 'documents, copy/paste, vectors, vector store indexes, database, Jupyter '
 'Notebook, dependencies, faiss-cpu, sentence_transformers, chromadb, Cython, '
 'tiktoken, rich, SSL, pytube, YouTube, SRT format, regex function, text '
 'cleanup, textwrap, API, access token, text files, local machine, evaluation, '
 'document loaders, character text splitter, HuggingFace embeddings, FAISS, '
 'ChatGPT, Prompt, Feature, Risk, AWS.')

From that, I expect I can do what I want.
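For the original question (unique terms, a count of each, and where they occur), a minimal sketch of the remaining step might look like this. It assumes the pages are shaped like the query result above, that the property is a rich_text one such as "Keyword Autofill", and that terms are comma-separated with an optional "Keywords:" prefix and a trailing full stop, as in my output. The function names are my own, and lowercasing the terms is a choice, not a requirement.

```python
from collections import Counter, defaultdict


def property_text(page: dict, name: str) -> str:
    """Join every rich_text segment of one property into a single string."""
    segments = page["properties"].get(name, {}).get("rich_text", [])
    return "".join(seg.get("text", {}).get("content", "") for seg in segments)


def split_terms(text: str) -> list:
    """Turn 'Keywords: a, b c, d.' into ['a', 'b c', 'd']."""
    text = text.strip()
    if text.lower().startswith("keywords:"):
        text = text[len("keywords:"):]
    text = text.rstrip(".")
    return [term.strip() for term in text.split(",") if term.strip()]


def count_terms(pages: list, name: str):
    """Return (term -> count, term -> list of page ids where it occurs)."""
    counts = Counter()
    where = defaultdict(list)
    for page in pages:
        for term in split_terms(property_text(page, name)):
            key = term.lower()  # assumption: 'LLM' and 'llm' are the same term
            counts[key] += 1
            if page["id"] not in where[key]:
                where[key].append(page["id"])
    return counts, dict(where)
```

`counts.most_common()` then lists the terms by frequency, and the second result says which pages each term came from. Writing the totals back into a Notion database would be a separate step.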
