Documentation indexing & search improvements

by Marcin Sągol

What is your idea about?
We aim to improve documentation indexing and search, along with the search user interface. The final UI should resemble GitHub's search, with additional, more detailed filtering options. We have proposed these changes to the Documentation Team and are willing to collaborate with them to work out the final solution.

The discussion about these changes took place here: add automatic filter to search results with documentation you have currently open · Issue #47 · TYPO3-Documentation/t3docs-search-indexer · GitHub

We want to prepare the API and the indexing for this kind of solution: add automatic filter to search results with documentation you have currently open · Issue #47 · TYPO3-Documentation/t3docs-search-indexer · GitHub

What do you want to achieve by the end of Q3 2024?

We want to prepare a technical draft of how the whole search process should look, what filters we will offer, how they will work, etc. - this should be the result of collaboration with the Documentation team.

Extend/modify how and what data we index, as we will need more detailed information in the index to support extended suggestions. We would like to cooperate with Elasticsearch experts on preparing the schema/structure for indexing.
Implement advanced suggestion handling on the backend (endpoints that handle requests from the frontend): we receive filters from the frontend, parse them, build an ES query to fetch suggestions, and return them to the frontend.
The first step would be consultations with the TYPO3 Documentation Team to work out all the filters and how they will work for users when searching through the documentation.
In the issue add automatic filter to search results with documentation you have currently open · Issue #47 · TYPO3-Documentation/t3docs-search-indexer · GitHub, I have posted a draft image of how the search form could look for a cursor, but this needs to be clarified and agreed upon with the Documentation Team.
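To make the "receive filters, parse them, build an ES query" part more concrete, here is a minimal sketch. It is written in Python only for brevity (the actual API would be PHP), and the filter names (`manual`, `version`, `language`), the `title` field and the function name are assumptions for illustration; the real set has to be agreed with the Documentation Team.

```python
from typing import Any

# Hypothetical filter names; the real ones are still to be defined.
TERM_FILTERS = ("manual", "version", "language")


def build_suggest_query(text: str, filters: dict[str, str], size: int = 5) -> dict[str, Any]:
    """Translate a search phrase plus frontend filters into an Elasticsearch query body."""
    # Exact-match filters go into the non-scoring "filter" clause of a bool query.
    # Filters that are unknown or empty are silently ignored here.
    filter_clauses = [
        {"term": {name: filters[name]}}
        for name in TERM_FILTERS
        if filters.get(name)
    ]
    return {
        "size": size,
        "query": {
            "bool": {
                # match_phrase_prefix is one option for search-as-you-type on a text field.
                "must": [{"match_phrase_prefix": {"title": {"query": text}}}],
                "filter": filter_clauses,
            }
        },
    }
```

For example, `build_suggest_query("site conf", {"version": "12.4"})` would return a body that restricts suggestions to documents whose `version` field equals `12.4`. Whether suggestions should come from a prefix query like this or from a dedicated suggester is exactly the kind of question we want to discuss with an Elasticsearch expert.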

Step 1: Consult and agree with the TYPO3 Documentation Team on how the search form should look and, more precisely, what functionalities and capabilities it should offer for users.

Next, once we have defined the requirements for the search capabilities, we need to properly design the index - what data will be stored and how.
We want to continue using Elasticsearch as it is currently in use. In cooperation with an Elasticsearch expert, we aim to design the index base and determine what data will be indexed and how.
After completing this part, we will develop code to iterate over the documentation files and index them properly in an optimized way.
This will enable us to easily search, generate suggestions, and provide search results based on the filters defined in step 1.
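As a rough illustration of what "design the index, then iterate over the documentation files" could mean in practice, below is a small sketch in Python (for brevity only; the real indexer is a PHP project). The field names, the assumption that the input is rendered `.html` files, the index name `docs` and the document ID scheme are all hypothetical and would be replaced by whatever comes out of the design work.

```python
import json
from pathlib import Path
from typing import Iterator

# Hypothetical mapping: keyword fields allow exact filtering, text fields full-text search.
INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "manual": {"type": "keyword"},
            "version": {"type": "keyword"},
            "title": {"type": "text"},
            "content": {"type": "text"},
        }
    }
}


def bulk_lines(root: Path, manual: str, version: str, index: str = "docs") -> Iterator[str]:
    """Yield NDJSON lines for the Elasticsearch _bulk endpoint, two lines per file."""
    for path in sorted(root.rglob("*.html")):
        doc_id = f"{manual}/{version}/{path.relative_to(root).as_posix()}"
        # First line: the bulk action with target index and document ID.
        yield json.dumps({"index": {"_index": index, "_id": doc_id}})
        # Second line: the document source. Using the file name as the title is a
        # placeholder; a real indexer would extract headings and strip the markup.
        yield json.dumps({
            "manual": manual,
            "version": version,
            "title": path.stem,
            "content": path.read_text(encoding="utf-8"),
        })
```

Sending many documents in one bulk request instead of one request per file is the usual way to keep indexing fast, and storing `manual` and `version` as keyword fields is what would make the filters from step 1 cheap to apply.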

Step 2: Design the index structure in cooperation with an Elasticsearch expert to efficiently index all data from the documentation files. This will ensure the search functionality generates the desired suggestions and search results, fulfilling the capabilities of the frontend search form defined in step 1.

Now, with the working index and data indexed, we can start designing the API endpoints that will allow searching the index from the frontend search form.
As a result, we should have a functional PHP API with all the endpoints required to make the search form offer the full functionality declared in step 1.

Step 3: Create all the required API endpoints to fulfill the needs of the search form, ensuring it works and offers the functionalities declared in step 1.
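To give an idea of what such an endpoint has to do before it talks to Elasticsearch, here is a small sketch of the request validation step. It is Python for brevity (the API itself would be PHP), and the parameter names `q`, `manual` and `version` as well as the response shape are assumptions, not an agreed contract.

```python
from urllib.parse import parse_qs

# Hypothetical set of supported filter parameters.
ALLOWED_FILTERS = {"manual", "version"}


def parse_suggest_request(query_string: str) -> tuple[int, dict[str, object]]:
    """Validate a raw query string and return (HTTP status, JSON-ready payload)."""
    params = parse_qs(query_string)
    text = params.get("q", [""])[0].strip()
    if not text:
        return 400, {"error": "Missing required parameter 'q'"}
    # Reject unknown parameters so typos in filter names do not pass silently.
    unknown = sorted(set(params) - ALLOWED_FILTERS - {"q"})
    if unknown:
        return 400, {"error": f"Unknown filter(s): {', '.join(unknown)}"}
    filters = {name: params[name][0] for name in ALLOWED_FILTERS if name in params}
    return 200, {"q": text, "filters": filters}
```

A request such as `q=site+config&version=12.4` would pass validation and yield the phrase `site config` with a single `version` filter, which can then be turned into an Elasticsearch query.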

In general, in Q3, we are focusing solely on the backend development of a new and improved TYPO3 documentation search engine.
After completing this, the final part (likely scheduled for Q4, if approved) will be to build the frontend interface.

What is the potential impact of your idea for the overall goal?

End users would benefit from this initiative because it would be easier to find information in the TYPO3 documentation. Filtering would let them narrow a search down, for example to a specific manual or version, and so get more precise, better-tailored results.

Which budget do you need for your idea?
10.000 Euro

Please note: After the start of the voting we cannot change the idea description or the idea outcome. If this idea is selected by the members, it must be achieved as described.


The competition was strong in Q2 (as it is now :wink:), but I really want to highlight that improving documentation indexing and search is vital so that people new to TYPO3 can easily find what they are looking for. If we do not offer a good way to get deep into TYPO3, we cannot acquire new contributors and keep the ecosystem and improvements going.

What I think a lot of “end users” of the documentation are finding difficult is to understand the different “Books” we have.

A filter section that reveals this structure would help people get to the information they are searching for faster, without being overwhelmed by a large result list. An initial overlay for the search would be a great “landing zone” for them. Restricting the search to a version number directly would also be very helpful.
