Comments on Changing Bits: A new version of the Compact Language Detector
Feed author: Michael McCandless. Last updated: 2023-09-01T03:38:08.236-04:00. 19 comments, newest first.

Diadistis, 2016-02-25T20:43:05.877-05:00:
I have created a managed library https://github.com/diadistis/cld2.net and a nuget package https://www.nuget.org/packages/CLD2.Net/ in case anyone is still interested.

Anonymous, 2016-02-08T09:14:49.669-05:00:
When I try to run "NCLD2.LanguageDetection.GetLanguageDetectionScores("你好吗");" I get the following error:
Additional information: Unable to load DLL 'Interop_x86.CLD2': The specified module could not be found. (Exception from HRESULT: 0x8007007E)

Windows 7 64bit, Visual Studio 2015, .NET v4.6.1.
Changing the console application's target to 64bit, or changing the .NET version, doesn't help.

Do you have any ideas why it's not working?
Thanks!

Michael McCandless, 2015-03-24T13:29:29.067-04:00:
Under CLD2 I see public/compact_lang_det.h ... it's checked into source control (svn)?

Anonymous, 2015-03-24T13:01:55.929-04:00:
I followed the README: I sourced cld2/internal/compile_libs.sh, which created cld2/internal/libcld2.so and cld2/internal/libcld2_full.so. I added cld2/internal to my LD_LIBRARY_PATH, edited setup.py and setup_full.py, and tried python setup.py build, which failed:

pycldmodule.cc:22:30: fatal error: compact_lang_det.h: No such file or directory

There are files with similar names in cld2/internal, e.g.

compact_lang_det.cc, compact_lang_det_hint_code.h, compact_lang_det_impl.h, compact_lang_det_hint_code.cc, compact_lang_det_impl.cc, compact_lang_det_test.cc

but no compact_lang_det.h.

Michael McCandless, 2015-03-24T11:35:28.600-04:00:
Hi Márton,

You need to install and build cld2 first, which contains compact_lang_det.h ... see the README.

Anonymous, 2015-03-24T11:26:18.808-04:00:
If I understand it correctly, pycldmodule.cc tries to include compact_lang_det.h, which is missing from trunk/internal. Could you please fix this or suggest a workaround? Thanks

Michael McCandless, 2015-02-11T10:34:56.350-05:00:
Alas I probably won't push new releases to PyPi: no time!

Anonymous, 2015-02-06T15:50:27.469-05:00:
This is great! Thanks for your time and effort. Any plans to update the module in PyPi?

Michael McCandless, 2014-12-19T06:29:22.681-05:00:
Alas I can't find the .NET bindings either ... I remember seeing them at one point ...

Nathan Ekstrom, 2014-10-10T13:38:18.997-04:00:
You mentioned .NET bindings. Are they available somewhere publicly? I haven't been able to find any.

Michael McCandless, 2014-06-08T04:19:11.855-04:00:
Maybe try https://github.com/saffsd/langid.py ? But in general, working well on short text is challenging for any language detector.

Casy_fill, 2014-06-06T17:03:10.057-04:00:
So what is the best library for language recognition on small texts?

Anonymous, 2014-05-07T05:32:29.951-04:00:
(call it = call the compiled file - compact_lang_det_test_full)*

Anonymous, 2014-05-07T05:31:20.528-04:00:
Maybe you can try compiling with compile_full.sh from the "internal" folder and call it from the command line using PHP? That's what I did with Perl...

Anonymous, 2014-04-29T06:40:47.316-04:00:
Hi,
Thank you for this tutorial, it works perfectly!
A question: how do I interface with PHP?
I searched the documentation, but I cannot find anything about CLD2 …
When I test the file test.py with Python, it's OK, but with PHP I get:
Fatal error: Class 'CLD\Detector' not found
Any ideas?

Thanks for your help.

Eric

Michael McCandless, 2013-10-03T06:25:48.883-04:00:
Hi, I don't fully understand the question. Can you re-ask this, with more details, on java-user@lucene.apache.org?

Anonymous, 2013-10-03T03:12:45.565-04:00:
Hi,
How can I retrieve the tokens from the index that matched a Lucene query? I am doing this for autocomplete. Can you suggest an alternative way to do autocomplete in Lucene?

Michael McCandless, 2013-09-17T11:40:53.897-04:00:
Indeed I see the same results as you; I think CLD2 is just not designed for short text.

You could try opening an issue at https://code.google.com/p/cld2/ and see what they say?

gnuton, 2013-09-17T09:25:40.611-04:00:
Hi!
Today I spent some time comparing the quality of a few Python libs for detecting the language of track names (really short text).
Unfortunately, according to my tests, CLD2 gave me really weird results. :(
Some examples:

"Meadowlake Street" is PL

cld2.detect("13 Años")
(False, 10, (('Unknown', 'un', 0, 0.0), ('Unknown', 'un', 0, 0.0), ('Unknown', 'un', 0, 0.0)))

CLD2 was not able to detect the language of "I love music" either.
Other libs are way slower than CLD2, but they gave me better results.
I'm wondering whether something was wrong with my installation, or whether someone else gets the same bad matches for the strings I posted here.

I ran my tests against millions of strings, not just a few.
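
For anyone hitting the same short-text problem, here is a minimal sketch of how the result tuple quoted in gnuton's comment could be screened before it is trusted. It does not import cld2 at all; it only unpacks a tuple of the shape shown in that comment. The field meanings (reliability flag, bytes found, then per-language tuples of name, code, percent, score) are an assumption read off that output, and `pick_language`, `min_percent` and the "en" sample tuple are hypothetical names and values made up for illustration.

```python
from typing import Optional, Sequence, Tuple

# Assumed shape of one per-language entry: (name, code, percent, score).
Detail = Tuple[str, str, int, float]


def pick_language(
    result: Tuple[bool, int, Sequence[Detail]],
    min_percent: int = 50,  # hypothetical threshold, tune for your data
) -> Optional[str]:
    """Return the top language code, or None when the guess should not be trusted."""
    is_reliable, _bytes_found, details = result
    if not is_reliable or not details:
        return None
    _name, code, percent, _score = details[0]
    # 'un' is the code that appears next to 'Unknown' in the quoted output.
    if code == "un" or percent < min_percent:
        return None
    return code


# The tuple quoted in the comment for "13 Años".
quoted = (False, 10, (("Unknown", "un", 0, 0.0),) * 3)

# A made-up tuple of the same shape, for illustration only.
made_up = (True, 40, (("ENGLISH", "en", 98, 1024.0), ("Unknown", "un", 0, 0.0)))

print(pick_language(quoted))
print(pick_language(made_up))
```

A caller that gets `None` back could then fall back to a slower detector such as langid.py, which is suggested earlier in the thread, rather than storing an 'Unknown' label.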