This post was last edited by 邱海波 on 2022-9-6 21:47
===========================================
*Note: if this thread is closed and can no longer be replied to, please move to the dedicated thread for discussion.
===========================================
(1) 2019.7.14: updated English Wikisource (英文喂鸡文库) 20190701, text-only trial version:
Download: Baidu Netdisk https://pan.baidu.com/s/1sCvtNOi0bpvbZdJCRvAJsA extraction code x5ni
===========================================
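(2) 2019.4.21: updated English Wikipedia (英文喂鸡百科) 20190401, text-only trial version:
(I). Production notes: Hello everyone. After two years and ten months, and at everyone's enthusiastic request, the MDict version of English Wikipedia has been updated again.
- English Wikipedia (英文喂鸡百科) 20190401
- Data version: April 1, 2019
- (1) Production info:
- · Entries: 14,589,846 entries, 416,901 formulas
- · Date: April 21, 2019
- · Data: http://dumps.wikimedia.org/enwiki
- · Tools: wikicafe 1.0 & MdxBuilder 3.0 beta2
- (2) Changelog:
- · 2019/4/17: first version built from the 20190401 data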
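a. Introduction:
Built from the latest data and converted with the wikicafe engine. Page rendering is acceptable. In total there are about 14.59 million entries plus about 417,000 formulas. The mdx files total 14.9 GB; the mdd file is 186 MB (×5 identical copies).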
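As a rough cross-check of the sizes quoted above, the compressed sizes printed in the output log at the end of this post can be added up. This is only a sketch: it assumes the log's "KB" means 1024 bytes and that the compressed index plus compressed data is close to the final file size, neither of which the log states.

```python
# Compressed sizes in KB, copied from the MDX and MDD output logs in this post.
MDX_DATA_KB = [3656348, 3428278, 3204491, 3069390, 2117537]
MDX_INDEX_KB = [12471, 29495, 40231, 38623, 31612]
MDD_DATA_KB = 179648
MDD_INDEX_KB = 11521

KB_PER_GIB = 1024 * 1024  # assumption: the log's KB is 1024 bytes


def kb_to_gib(kb):
    """Convert a size in KB to GiB."""
    return kb / KB_PER_GIB


# Assumption: file size is roughly compressed index + compressed data.
mdx_total_kb = sum(MDX_DATA_KB) + sum(MDX_INDEX_KB)
mdd_total_kb = MDD_DATA_KB + MDD_INDEX_KB

print(f"five mdx volumes:   {kb_to_gib(mdx_total_kb):.1f} GiB")
print(f"one mdd:            {mdd_total_kb / 1024:.1f} MiB")
print(f"mdx + 5 mdd copies: {kb_to_gib(mdx_total_kb + 5 * mdd_total_kb):.1f} GiB")
```

Under these assumptions the totals come out at about 14.9 GiB for the five mdx volumes and about 15.8 GiB once five copies of the mdd are included, which is in line with the 14.9 GB and 15.8 GB figures given in this post.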
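b. Production notes:
#Motivation: I started making the English Wikipedia dictionary because friends on several forums asked whether, besides the Chinese encyclopedia, I could also make an English one. I had no experience and the work was hugely time-consuming; I did not expect to have the patience and perseverance to actually finish it.
#Process: The source data file (.bz2) is about 14.9 GB and took about a day and a half to download. Without decompressing it, the wikicafe engine processed the text directly in about one day, producing about 56 GB of txt data. This was then split into ten text files of about 6 GB each.
For each 6 GB file, the entries at the first and last lines were fixed up, and then the internal-link problem affecting millions of entries was resolved (see details). Every two 6 GB files were merged into one 12 GB file, giving five large text files of roughly 12 GB each. These five files were then processed with MdxBuilder, yielding five mdx files, provisionally called volumes 1 to 5. The five corresponding mdd files were built from the same data folder, so their content and size are identical; downloading one of them is enough.
Converting to mdx and mdd took about one day. Uploading took about two days, for about 15.8 GB of files.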
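The splitting step is where the first and last entries of each part can get cut in half. A minimal sketch of one way to avoid that is shown here. It assumes the text is in the usual MdxBuilder source layout, where each entry ends with a line containing only `</>`; the function name `split_file` and the part naming are hypothetical, and this is not necessarily how the files were actually split.

```python
def split_file(src_path, out_prefix, max_bytes, terminator=b"</>"):
    """Split src_path into numbered parts and return the list of part paths.

    A part is closed only right after an entry terminator line, so no
    entry is ever divided between two parts.
    """
    paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            if out is None:
                # Open the next part lazily so no empty trailing file is created.
                path = f"{out_prefix}{len(paths) + 1:03d}.txt"
                out = open(path, "wb")
                paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
            # Cut only once the size target is reached AND an entry just ended.
            if written >= max_bytes and line.strip() == terminator:
                out.close()
                out = None
    if out is not None:
        out.close()
    return paths
```

Called with a size target of about 6 GB, for example `split_file("enwiki.txt", "part", 6 * 1024**3)` with a hypothetical input name, each part ends slightly past the target on a complete entry. Reading and writing in binary mode keeps the byte count exact and streams the data line by line instead of loading it into memory.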
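#Other:
The wikicafe engine is efficient, but it does almost no processing of the templates in entry content, so some pages, especially entries about people, may be laid out poorly. If that bothers you, please do not use this dictionary.
This is a text-only trial version, with formulas but no images, for trial use only! There is no guarantee that page rendering will be improved later or that timely updates will follow.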
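(II). Download:
English Wikipedia (英文喂鸡百科) 20190401: five mdx volumes in total, 001-005. Use them together when searching.
Download: Baidu Netdisk https://pan.baidu.com/s/1fVCKI-Ot_XZIAroceqp9RA extraction code 29hi
Making this took a great deal of work; sponsorship is very welcome!
(III). Contact me:
Email: [email protected]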
(IV). Output log:
MDX:
1.
- Begining loading source file...
- Done
- Time used for this section: 181 seconds
- Sorting dictionary...
- Done!
- Begin processing index...
- Done!
- Original index size = 32516KB, compressed size = 12471KB, compression ratio = 38%
- Time used for this section: 20 seconds
- Begin processing data contents...
- Done!
- Original text size = 12507470KB, compressed size = 3656348KB, compression ratio = 29%
- Time used for this section: 2396 seconds
- Number of entries: 1265107
- Conversion succeed!
2.
- Begining loading source file...
- Done
- Time used for this section: 249 seconds
- Sorting dictionary...
- Done!
- Begin processing index...
- Done!
- Original index size = 78394KB, compressed size = 29495KB, compression ratio = 37%
- Time used for this section: 47 seconds
- Begin processing data contents...
- Done!
- Original text size = 12390924KB, compressed size = 3428278KB, compression ratio = 27%
- Time used for this section: 2786 seconds
- Number of entries: 2886884
- Conversion succeed!
3.
- Begining loading source file...
- Done
- Time used for this section: 66 seconds
- Sorting dictionary...
- Done!
- Begin processing index...
- Done!
- Original index size = 110349KB, compressed size = 40231KB, compression ratio = 36%
- Time used for this section: 6 seconds
- Begin processing data contents...
- Done!
- Original text size = 12317878KB, compressed size = 3204491KB, compression ratio = 26%
- Time used for this section: 1115 seconds
- Number of entries: 3904764
- Conversion succeed!
4.
- Begining loading source file...
- Done
- Time used for this section: 69 seconds
- Sorting dictionary...
- Done!
- Begin processing index...
- Done!
- Original index size = 105024KB, compressed size = 38623KB, compression ratio = 36%
- Time used for this section: 7 seconds
- Begin processing data contents...
- Done!
- Original text size = 12324338KB, compressed size = 3069390KB, compression ratio = 24%
- Time used for this section: 1044 seconds
- Number of entries: 3640568
- Conversion succeed!
5.
- Begining loading source file...
- Done
- Time used for this section: 215 seconds
- Sorting dictionary...
- Done!
- Begin processing index...
- Done!
- Original index size = 88722KB, compressed size = 31612KB, compression ratio = 35%
- Time used for this section: 46 seconds
- Begin processing data contents...
- Done!
- Original text size = 8721117KB, compressed size = 2117537KB, compression ratio = 24%
- Time used for this section: 1737 seconds
- Number of entries: 2892523
- Conversion succeed!
MDD:
- Begining scaning data directory ...
- Done
- Begin processing data file index...
- Done!
- Original index size = 36641KB, compressed size = 11521KB, compression ratio = 31%
- Begin processing data file contents...
- Done!
- Original text size = 219289KB, compressed size = 179648KB, compression ratio = 81%
- Number of entries: 416901
- Conversion succeed!
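The figures in these logs can be pulled out mechanically. The sketch below is only an illustration: the regular expressions are written against the log lines as they appear in this post, the helper name `summarize` is hypothetical, and the sample text is an excerpt of the volume 1 log.

```python
import re

# Patterns matching the log lines exactly as they appear in this post.
ENTRIES_RE = re.compile(r"Number of entries:\s*(\d+)")
SIZE_RE = re.compile(
    r"Original (?:index|text) size = (\d+)KB, compressed size = (\d+)KB"
)


def summarize(log_text):
    """Return total entries and original/compressed sizes (KB) found in a log."""
    entries = sum(int(n) for n in ENTRIES_RE.findall(log_text))
    sizes = [(int(o), int(c)) for o, c in SIZE_RE.findall(log_text)]
    return {
        "entries": entries,
        "original_kb": sum(o for o, _ in sizes),
        "compressed_kb": sum(c for _, c in sizes),
    }


# Entry counts copied from the five MDX logs above.
VOLUME_ENTRIES = [1265107, 2886884, 3904764, 3640568, 2892523]

# Excerpt of the volume 1 log, used as sample input.
sample = """\
- Original index size = 32516KB, compressed size = 12471KB, compression ratio = 38%
- Original text size = 12507470KB, compressed size = 3656348KB, compression ratio = 29%
- Number of entries: 1265107
"""

print(summarize(sample))
print("entries in all five volumes:", sum(VOLUME_ENTRIES))
```

The five per-volume counts add up to 14589846, the entry total given in the production info, and the MDD log's 416901 entries matches the formula count.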
===========================================