Support reading Warodai (UTF-16 text, Japanese-Russian) #569

Open
GrimPixel opened this issue Jun 15, 2024 · 6 comments

@GrimPixel

Would you consider supporting the txt format of Warodai?

@ilius ilius added the Feature label Jun 16, 2024
@ilius ilius changed the title Support Warodai Support reading Warodai (UTF-16 text, Japanese-Russian) Sep 12, 2024
@banditto9

Hello, and many thanks for your wonderful tool!
I'd like to second the original poster's request: the text format of the Warodai dictionary is the most frequently updated one and also the most universal.
Personally, I'm looking forward to this support because my goal is to convert the dictionary to EPUB and MOBI for use on a Kindle e-reader.
Thank you.

@banditto9

A small addition: the Warodai project also links to a contributor's tool that produces an EDICT-format file from the text: https://github.com/update692/warodai-to-edict The result is "output.txt", which looks similar to a CSV file but uses "/" as the delimiter. I think this makes the conversion easier. Support for the old but still existing EDICT (not EDICT2) format could be added to the PyGlossary project at the same time. Thank you for your time and attention.
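
To illustrate why the "/" delimiter makes this kind of file easy to process, here is a minimal Python sketch that parses one line in the classic EDICT layout (`HEADWORD [READING] /gloss/gloss/`). The names `EdictEntry` and `parse_edict_line` are hypothetical, the sample lines are made up for illustration and are not taken from the actual "output.txt", and this is not PyGlossary code.

```python
import re
from typing import List, NamedTuple, Optional


class EdictEntry(NamedTuple):
    """One parsed EDICT line (hypothetical helper type)."""
    headword: str
    reading: Optional[str]
    glosses: List[str]


# HEADWORD, an optional [READING], then glosses wrapped in and separated by "/".
_EDICT_LINE = re.compile(
    r"^(?P<head>\S+)\s+(?:\[(?P<reading>[^\]]*)\]\s+)?/(?P<glosses>.*)/\s*$"
)


def parse_edict_line(line: str) -> Optional[EdictEntry]:
    """Return the parsed entry, or None if the line does not look like EDICT."""
    match = _EDICT_LINE.match(line.strip())
    if match is None:
        return None
    # Drop empty pieces so a doubled "//" does not produce an empty gloss.
    glosses = [g for g in match.group("glosses").split("/") if g]
    return EdictEntry(match.group("head"), match.group("reading"), glosses)


if __name__ == "__main__":
    # Made-up sample line, for illustration only.
    print(parse_edict_line("挨拶 [あいさつ] /приветствие/поздравление/"))
```

One caveat of this approach: a gloss that itself contains "/" would be split in two, so a real reader would need to handle that case.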

@soshial
Contributor

soshial commented Oct 30, 2024

Just use the StarDict version and convert it to the desired format, @banditto9

@banditto9

@soshial Hello, and thanks for the tip. I know it exists, but that version is quite old. The txt is updated much more frequently (months vs. years), since third-party conversions are rarely performed.

@soshial
Contributor

soshial commented Oct 30, 2024

No one is going to develop for a single 1-off case, especially for free. It's simply not realistic.

@banditto9

@soshial I see. I'd be happy to make a private fork, but unfortunately I'm not a developer. In the meantime, I managed a rough conversion: I used https://github.com/update692/warodai-to-edict to get one-line output and then split it into CSV in Excel. I hope this is of some help to the original poster. It worked, but the lack of formatting makes the result not very usable, because everything ends up on one line with no \n or \r\n.

What could be added, so that this is not a "1-off case", is a text option with a "\r\n separator", which is missing now: a "vertical" alternative to the existing "horizontal" CSV text option. I understand this is not yet a common standard for plain-text dictionaries, but it may exist outside Warodai.
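
As a rough illustration of what such a "vertical" reader might look like, here is a minimal Python sketch. It assumes, purely for the example, that the file is UTF-16 (as in the issue title), that entries are separated by blank lines, and that the first line of each entry is the headword; none of this is confirmed here as the actual Warodai layout, and `split_records` is a hypothetical name, not a PyGlossary function.

```python
import re
from typing import List, Tuple


def split_records(raw: bytes, encoding: str = "utf-16") -> List[Tuple[str, str]]:
    """Split a 'vertical' text dictionary into (headword, definition) pairs.

    Assumption: entries are separated by one or more blank lines and the
    first line of each entry is the headword.
    """
    text = raw.decode(encoding)
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if not lines:
            continue
        # Keep line breaks inside the definition instead of flattening them.
        records.append((lines[0], "\n".join(lines[1:])))
    return records


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = "猫\r\nкошка\r\n\r\n犬\r\nсобака\r\nпёс\r\n".encode("utf-16")
    for headword, definition in split_records(sample):
        print(repr(headword), repr(definition))
```

Because the definition keeps its internal line breaks, the formatting that is lost in the one-line CSV route would survive in this form.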

Screenshot 2024-10-31 094418
Screenshot 2024-10-31 095528
