Chaarangan
d0d5305e5b
Improved Tamil and Sinhala traineddata
2021-09-01 14:48:52 +05:30
Stefan Weil
e2aad9b983
ita: Remove ita.config from ita.traineddata
...
It added a user_words_suffix which should be reserved for
user configurations.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-11-30 22:03:13 +01:00
zdenop
9e8aeef07c
Merge pull request #47 from SherSpock/patch-2
...
Update README
2020-03-09 08:28:45 +01:00
Ryder Timberlake
d288680f57
Update README
...
Replace unsupported wiki link with equivalent hosted doc link
2020-03-08 17:07:13 -04:00
Stefan Weil
c5e0a7294a
Update tessconfigs
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-10-23 13:32:42 +02:00
Stefan Weil
e4173f4456
Update URL for tessconfigs submodule (use HTTPS)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-10-11 13:08:43 +02:00
Stefan Weil
41e829655f
Add tessconfigs submodule and links for required tessdata files
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-09-03 16:07:05 +02:00
zdenop
e9f15884bc
Merge pull request #37 from stweil/master
...
Fix extra intra-word spacing for several Asian languages (GitHub issue #991)
2019-05-22 12:15:06 +02:00
Stefan Weil
ea00692e71
Fix extra intra-word spacing for Thai (GitHub issue #991)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-21 17:50:06 +02:00
Stefan Weil
80b4d76313
Fix extra intra-word spacing for Japanese (GitHub issue #991)
...
Also fix the encoding of tessedit_char_blacklist.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-21 17:49:35 +02:00
Stefan Weil
5075f27776
Fix extra intra-word spacing for Chinese (GitHub issue #991)
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-05-21 17:48:52 +02:00
zdenop
95593f0b01
Merge pull request #33 from stweil/master
...
Improve documentation
2018-10-23 16:57:34 +02:00
Stefan Weil
2d255780f3
Improve documentation
...
These models don't work with old versions of Tesseract.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-23 16:36:18 +02:00
zdenop
f8c44498f3
Merge pull request #28 from stweil/master
...
Remove parameter textord_tabfind_vertical_horizontal_mix
2018-05-28 16:11:44 +02:00
Stefan Weil
786983dddb
Remove parameter textord_tabfind_vertical_horizontal_mix
...
It was added to Tesseract in 2010 and removed in 2018, but never used.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-05-28 15:57:28 +02:00
zdenop
ce12640701
Merge pull request #26 from Shreeshrii/master
...
correct name kur_ara to kmr - Kurmanji (Latin script)
2018-04-25 19:31:01 +02:00
Shree Devi Kumar
788e2fe923
correct name kur_ara to kmr - Kurmanji (Latin script)
2018-04-25 22:47:45 +05:30
zdenop
09a3a39156
Merge pull request #25 from Shreeshrii/master
...
Fix config file for Korean, remove `tessedit_load_sublangs chi_tra`
2018-04-09 19:49:25 +02:00
Shreeshrii
c7c86bb8de
Fix config file for Korean, remove tessedit_load_sublangs chi_tra
...
Addresses https://groups.google.com/d/msgid/tesseract-ocr/1e5142e1-d198-46d3-95ee-1a3206d1a2c4%40googlegroups.com?utm_medium=email&utm_source=footer
2018-04-09 19:58:26 +05:30
zdenop
7a1c6b06d7
Merge pull request #21 from stweil/script
...
Move trained data for scripts to new subdirectory
2018-03-10 21:29:59 +01:00
Stefan Weil
a2f7ced76b
Move trained data for scripts to new subdirectory
...
This fixes a name conflict for Lao.traineddata and lao.traineddata
which could not be distinguished on case insensitive filesystems
(for example macOS, Windows).
It also makes it easier for users to see which data is for scripts.
Choosing a script now works like this: tesseract -l script/Latin ...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-03-10 21:12:04 +01:00
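The name conflict described in the commit above can be illustrated with a small sketch. It groups file names that differ only in letter case, which is how a case-insensitive filesystem would treat them. The helper name `case_insensitive_collisions` and both file lists are hypothetical and chosen only for illustration; they are not taken from the repository.

```python
from collections import defaultdict


def case_insensitive_collisions(filenames):
    """Return groups of names that are equal when letter case is ignored."""
    groups = defaultdict(list)
    for name in filenames:
        # casefold() approximates how a case-insensitive filesystem compares names.
        groups[name.casefold()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical flat layout: script and language models share one directory.
flat = ["Lao.traineddata", "lao.traineddata", "Latin.traineddata", "eng.traineddata"]

# Hypothetical layout after moving script models into a subdirectory.
nested = ["script/Lao.traineddata", "lao.traineddata",
          "script/Latin.traineddata", "eng.traineddata"]

print(case_insensitive_collisions(flat))    # [['Lao.traineddata', 'lao.traineddata']]
print(case_insensitive_collisions(nested))  # []
```

With the script models under their own path prefix, the two Lao files no longer share a case-folded name, which matches the motivation given in the commit message.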
zdenop
51ebb64c29
Merge pull request #19 from stweil/master
...
Add Devanagari config file to fix auto PSM issue #1273
2018-02-27 08:25:38 +01:00
Stefan Weil
84bd10ed89
Add Devanagari config file to fix auto PSM issue #1273
...
Devanagari.config was copied from tessdata_fast.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-02-27 07:33:28 +01:00
zdenop
208f104882
Merge pull request #1 from stweil/master
...
Improve GitHub integration
2018-02-02 10:37:00 +01:00
Stefan Weil
e744fa9056
Rename license file
...
Tesseract uses the file LICENSE to show the Apache License,
so rename COPYING to LICENSE.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-02-02 10:18:00 +01:00
Stefan Weil
9963c18ace
README: Improve description and add link to Tesseract wiki
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-02-02 10:09:47 +01:00
Stefan Weil
fb9ae6ba2d
README: Add text from former COPYRIGHT and add links
...
Also format the text so it looks nicer on GitHub.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-02-02 10:09:47 +01:00
Stefan Weil
4928952a62
Use the full Apache License text
...
Now GitHub is able to detect and show the project license.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-02-02 10:09:47 +01:00
zdenop
3e6ec162ae
Merge pull request #17 from stweil/deu
...
deu: Remove unwanted dependency
2018-02-02 10:02:28 +01:00
Stefan Weil
ed5410b928
deu: Remove unwanted dependency
...
The data included a configuration which required frk.traineddata
("tessedit_load_sublangs frk"). Remove that.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-02-01 15:29:03 +01:00
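The dependency described in the deu commit above comes from a single config line. The sketch below lists the extra languages such a config would pull in. It assumes the config is plain text with one `name value` pair per line and that multiple sub-languages are joined with `+`; both are assumptions for illustration, and the helper name `required_sublangs` is hypothetical.

```python
def required_sublangs(config_text):
    """List languages named by tessedit_load_sublangs lines in a config text."""
    langs = []
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "tessedit_load_sublangs":
            # Assumption: several sub-languages are separated by '+'.
            langs.extend(part for part in fields[1].split("+") if part)
    return langs


# Config line quoted in the commit message, before the fix.
before = "tessedit_load_sublangs frk\n"
# After the fix the line is gone, so nothing extra is required.
after = ""

print(required_sublangs(before))  # ['frk']
print(required_sublangs(after))   # []
```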
Jeff Breidenbach
f1d12682c0
Use legacy Orientation Script Detector (OSD) because that is the only thing that currently works.
2017-09-15 11:44:08 -07:00
zdenop
5cf1eaafa4
Merge pull request #3 from Shreeshrii/master
...
Fix Config files to LSTM only for nep and mar
2017-09-15 17:56:59 +02:00
Shreeshrii
9c5c2cb2e7
Fix Config files to LSTM only for nep and mar
...
Change default mode to
tessedit_ocr_engine_mode 1
2017-09-15 21:22:28 +05:30
zdenop
84ae67cd6f
Merge pull request #2 from Shreeshrii/master
...
Fix config files - Tesseract/LSTM combiner to LSTM only
2017-09-15 17:04:17 +02:00
Shreeshrii
09e4326246
Fix config files: change from Tesseract/LSTM combiner to LSTM only
...
Config files had tessedit_ocr_engine_mode 2
causing processing with --oem 3 (default mode based on config file) to fail with:
Failed loading language 'san' / 'hin'
Tesseract couldn't load any languages!
Could not initialize tesseract.
2017-09-15 18:37:50 +05:30
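The change described in the commit above amounts to rewriting one parameter in a config file. The following is a minimal sketch of that text edit, assuming a plain-text config with one `name value` pair per line. The helper `set_config_param` and the sample text are hypothetical; this is not the procedure the commit author used, which the log does not describe.

```python
def set_config_param(config_text, name, value):
    """Return config_text with parameter `name` set to `value`.

    An existing line for `name` is replaced; otherwise a new line is appended.
    """
    out = []
    found = False
    for line in config_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            out.append(f"{name} {value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{name} {value}")
    return "\n".join(out) + "\n"


# Hypothetical config content with the combiner mode (2) from the commit message.
old_config = "tessedit_ocr_engine_mode 2\n"

# Switch to LSTM only (1), as the commit describes.
new_config = set_config_param(old_config, "tessedit_ocr_engine_mode", 1)
print(new_config, end="")  # tessedit_ocr_engine_mode 1
```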
Jeff Breidenbach
c222ed852e
add license info
2017-09-14 15:04:55 -07:00
Jeff Breidenbach
9ddc24e750
Initial import (on behalf of Ray)
2017-09-14 14:45:10 -07:00
theraysmith
549354e9f1
Initial commit
2017-09-11 18:12:33 +01:00