
What is Scriptorai?
Scriptorai is an open source, community-oriented project that hosts first-pass, LLM-powered translations of public domain esoteric texts. The goal is to make previously inaccessible texts browsable by general readers and human translators alike, so that the translations can then be improved collaboratively. My first focus is to translate all available PDF volumes of the Catalogus Codicum Astrologorum Graecorum (CCAG), a 12-volume catalogue of all known astrological writings in Greek, published between 1898 and 1953 and considered the most important modern survey of Greek astrological writing. Currently, due to cost, only the first volume has been translated.
Catalogus Codicum Astrologorum Graecorum - Volume 1
Why AI translation?
AI translation (with large language models) is far from perfect and cannot replace skilled human translators, especially for obscure and niche topics like esoteric texts. However, it is much better at translation than you might think, and good enough to let us get started. The preface to the first volume of the CCAG describes the editors' decision to publish the volumes without every possible source with the Greek phrase Πλέον ἡμίσυ παντός, or "half is better than the whole". That is the spirit in which I have approached this project: it's not perfect, but something is better than nothing.
With LLM translation, we can do things that are out of reach for individual human translators. Even understanding and conveying the topics of a single text would take a human translator a great amount of time and effort. The reason these texts have not already been translated is simply that the number of people with the interest, knowledge, and opportunity to translate texts like these is vanishingly small, perhaps in the dozens. By batch-translating these texts, we can at least make it easier to understand their contents and choose where skilled human effort should be applied to produce high-quality translations.
Some have criticized AI for its high energy usage, but these claims are usually exaggerated and under-researched. This is a good resource for understanding the real environmental impact of AI's energy usage. In any case, I feel that the benefits of making these texts accessible to English readers far outweigh the one-time energy cost of translating each text.
For a longer (but similar) post on my thoughts about the significance of the CCAG and why I'm doing this, see my Scriptorai announcement post on Substack.
Full text search
One major benefit of producing a full transcription is that it makes full text search possible. I have implemented a search feature that can help you find passages of interest in the texts. Try it out here!
Help with translation
If you spot an issue or have a correction for either the transcription or the translation, you can suggest changes from any page by clicking the "Suggest Corrections on GitHub" button. This lets you contribute suggestions directly to the GitHub repository that contains the image, translation, and transcription files for the given text.
Open source and public domain
Because the source texts are in the public domain, it is important to me that all of the tools used to produce the transcriptions, the translations, and this site are open source and publicly available.
Scriptorai, meaning this site itself, is fully open source. Its code is available on GitHub at sadalsvvd/scriptorai and is licensed under the MIT license.
The repositories for the individual CCAG volumes, along with their transcriptions and translations, are dedicated to the public domain under the Creative Commons Zero v1.0 Universal license. You can find the repository for the first volume of the CCAG on GitHub at sadalsvvd/scriptorai-ccag-01.
The tool used to produce the translations and transcriptions is called hemiplon, or "half-doubling", available on GitHub at sadalsvvd/hemiplon. It is also licensed under the MIT license.
Known Issues
- Occasional issues capturing footnotes
- Translation of sentences that span page breaks does not work well yet; make sure to check any sentence that continues across multiple pages
- Certain numerical values may be rendered incorrectly, such as years and months being changed to incorrect decimal year values. Check these values carefully.
Contact
If you have any questions or feedback, you can email me at sadalsvvd@gmail.com, or @ me on Bluesky at sadalsvvd.space. You can find my personal website at sadalsvvd.space.