Ancestors

Written by manisha on 2025-01-06 at 15:53

Does anyone know of an #OpenAccess full-text #PDF #search engine/tool that I can use to search for relevant PDFs in a self-hosted #database?

Context: we have a curated database of #research articles but so far our search capability has been limited to tagged keywords or title and abstract field search only. We'd like to be able to search the entire PDF.

Side note: I know that PDFs are not a great way to store scientific information. I'd prefer not to use a proprietary #LLM if possible.

#LexicalSearch #SemanticSearch #AskAcademia #academia #science #sciences #ScienceMastodon #AskFedi #OpenScience


Written by El Duvelle on 2025-01-06 at 16:17

@manisha Not sure if that really fits what you want but #Zotero does full-text search of PDFs: https://www.zotero.org/support/searching


Written by manisha on 2025-01-06 at 16:30

@elduvelle thank you!! It looks like a good option. Someone else also suggested it. I've used it as a reference manager but never dived into its full-text indexing feature. Do you happen to know if Zotero libraries can be made public?


Written by El Duvelle on 2025-01-06 at 21:20

@manisha I haven't done it myself but it looks like it's just a matter of setting the library's visibility to "public": https://forums.zotero.org/discussion/88809/keep-public-library


Toot

Written by jonny (good kind) on 2025-01-07 at 00:22

@elduvelle

@manisha

Ya ya zotero would def be how I'd do it. Do you need it to be public so that it can be publicly full text searchable from a website or smth? Slightly different problem than being able to do it on your own zotero client but I'd still use zotero as the basis of the collection


Descendants

Written by manisha on 2025-01-07 at 15:52

@elduvelle thank you so much for that link!

@jonny yeah, need it to be publicly searchable which I think is doable...
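
For readers wondering what full-text (lexical) search adds over tag or title/abstract search, here is a minimal sketch, not what Zotero does internally. It assumes the text has already been extracted from each PDF; the file names, sample text, and the helper names `tokenize`, `build_index`, and `search` are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index mapping each token to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the documents that contain every token of the query (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical, made-up stand-ins for text extracted from two PDFs.
documents = {
    "paper_a.pdf": "This article discusses grid cells and spatial navigation.",
    "paper_b.pdf": "This article discusses place cells and memory.",
}
index = build_index(documents)
matches = search(index, "place cells")
```

Because the query is matched against the whole body text, a term that never appears in a tag, title, or abstract can still be found. A public, website-facing search would need more than this (ranking, phrase queries, a server), so treat it only as an illustration of the idea.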

