| author | S. Solomon Darnell | 2025-03-28 21:52:21 -0500 |
|---|---|---|
| committer | S. Solomon Darnell | 2025-03-28 21:52:21 -0500 |
| commit | 4a52a71956a8d46fcb7294ac71734504bb09bcc2 (patch) | |
| tree | ee3dc5af3b6313e921cd920906356f5d4febc4ed /.venv/lib/python3.12/site-packages/striprtf-0.0.28.dist-info/METADATA | |
| parent | cc961e04ba734dd72309fb548a2f97d67d578813 (diff) | |
| download | gn-ai-master.tar.gz | |
Diffstat (limited to '.venv/lib/python3.12/site-packages/striprtf-0.0.28.dist-info/METADATA')
| -rw-r--r-- | .venv/lib/python3.12/site-packages/striprtf-0.0.28.dist-info/METADATA | 66 |
1 files changed, 66 insertions, 0 deletions
diff --git a/.venv/lib/python3.12/site-packages/striprtf-0.0.28.dist-info/METADATA b/.venv/lib/python3.12/site-packages/striprtf-0.0.28.dist-info/METADATA
new file mode 100644
index 00000000..7e7df6a4
--- /dev/null
+++ b/.venv/lib/python3.12/site-packages/striprtf-0.0.28.dist-info/METADATA
@@ -0,0 +1,66 @@
+Metadata-Version: 2.1
+Name: striprtf
+Version: 0.0.28
+Summary: A simple library to convert rtf to text
+Home-page: https://github.com/joshy/striprtf
+Author: Joshy Cyriac
+Author-email: joshy@posteo.ch
+License: BSD-3-Clause
+Download-URL: https://github.com/joshy/striprtf/archive/v0.0.28.tar.gz
+Keywords: rtf
+Platform: UNKNOWN
+Classifier: License :: OSI Approved :: BSD License
+Description-Content-Type: text/markdown
+License-File: LICENSE
+
+# striprtf
+
+
+## Purpose
+This is a simple library to convert Rich Text Format (RTF) files to python strings.
+A lot of medical documents are written in RTF format which is not ideal for parsing
+and further processing. This library converts it to plain old text.
+
+## How to use it
+```python
+from striprtf.striprtf import rtf_to_text
+rtf = "some rtf encoded string"
+text = rtf_to_text(rtf)
+print(text)
+```
+
+If you want to use a different encoding than `cp1252` you can pass it via the `encoding`
+parameter. This is only taken into account if no explicit codepage has been set.
+```python
+from striprtf.striprtf import rtf_to_text
+rtf = "some rtf encoded string in latin1"
+text = rtf_to_text(rtf, encoding="latin-1")
+print(text)
+```
+
+Sometimes UnicodeDecodingErrors can happen because of various reasons.
+In this case you can try to relax the encoding process like this:
+```python
+from striprtf.striprtf import rtf_to_text
+rtf = "some rtf encoded string"
+text = rtf_to_text(rtf, errors="ignore")
+print(text)
+```
+
+## Online version
+If you don't want to install or just try it out there is an [online version](https://striprtf.dev) available. 
+
+## PostgreSQL
+There is also a [PostgreSQL version](https://github.com/MnhnL/pg_striprtf) available from [Raffael Mancini](https://github.com/raffael-mnhn).
+
+## History
+[Pyth](https://github.com/brendonh/pyth) was not working for the rtf files I
+had. The next best thing was this gist:
+https://gist.github.com/gilsondev/7c1d2d753ddb522e7bc22511cfb08676
+
+~~Very few additions where made, e.g. better formatting of tables. ~~
+
+In the meantime some encodings bugs have been fixed. :-)
+
+
+
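
Note on the `encoding` and `errors` parameters described in the added README: the error it calls "UnicodeDecodingErrors" is Python's built-in `UnicodeDecodeError`. The sketch below uses only the standard library to show what a fallback codepage and a relaxed error handler mean for a single RTF-style hex escape such as `\'e9`. The helper `decode_escape` is hypothetical and is not part of striprtf. That striprtf forwards these arguments to Python's codec machinery in the same way is an assumption, not something this diff confirms.

```python
def decode_escape(hex_pair: str, encoding: str = "cp1252", errors: str = "strict") -> str:
    """Decode one RTF-style \\'hh escape (hypothetical helper, not striprtf code)."""
    # Turn the two hex digits into a single byte, then decode it with the
    # chosen codepage and error handler.
    return bytes.fromhex(hex_pair).decode(encoding, errors=errors)


# The same byte can map to different characters under different codepages.
print(decode_escape("80"))                            # cp1252: euro sign
print(repr(decode_escape("80", encoding="latin-1")))  # latin-1: a C1 control character

# Byte 0x81 is undefined in Python's cp1252 codec, so strict decoding fails.
try:
    decode_escape("81")
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc.reason)

# errors="ignore" silently drops the undecodable byte instead of raising.
print(repr(decode_escape("81", errors="ignore")))
```

The trade-off with `errors="ignore"` is silent data loss: undecodable bytes disappear from the output, so it is usually worth trying the correct `encoding` first.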
