
Wikimedia .jar file..

Status
Not open for further replies.

youradds

Programmer
Jun 27, 2001
817
GB
Hi,

Not really sure if anyone will have any ideas.. but I thought it was worth asking =)

I'm trying to import the Wikipedia dumps, but I'm having problems with the "limits" in place.

For example, using this command:


java -server -jar mwdumper.jar --format=sql:1.5 20051127_pages_articles.xml.bz2 > dump.sql

..gives:

user@undevmac wiki $ java -server -jar mwdumper.jar --format=sql:1.5 20051127_pages_articles.xml.bz2 > dump.sql
1,000 pages (231.803/sec), 1,000 revs (231.803/sec)
2,000 pages (260.146/sec), 2,000 revs (260.146/sec)
Exception in thread "main" java.io.IOException: Parser has reached the entity expansion limit "64,000" set by the Application.
at org.mediawiki.importer.XmlDumpReader.readDump(Unknown Source)
at org.mediawiki.dumper.Dumper.main(Unknown Source)
user@undevmac wiki $

Has anyone had any experience with this, and know how to get around or raise the limit?

TIA!

Andy
 
OMG.. you're a star! I did read about that, but tried putting it in the wrong place *blush*

i.e.:

java -server -jar mwdumper.jar -DentityExpansionLimit=100000 --format=sql:1.5 20051127_pages_articles.xml.bz2 > dump.sql


..instead of:

java -server -jar -DentityExpansionLimit=100000 mwdumper.jar --format=sql:1.5 20051127_pages_articles.xml.bz2 > dump.sql
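For anyone else hitting this: everything after `-jar mwdumper.jar` is handed to the application as ordinary arguments, while `-D` system properties only take effect on the JVM when they appear before `-jar`. A minimal sketch of the difference (the `ShowLimit` class name is just for illustration, not part of mwdumper):

```java
// Compile with: javac ShowLimit.java
public class ShowLimit {
    public static void main(String[] args) {
        // A -DentityExpansionLimit=... placed BEFORE -jar (or the class name)
        // becomes a system property; placed AFTER, it lands here in args
        // and the JVM never sees it.
        String limit = System.getProperty("entityExpansionLimit", "(not set)");
        System.out.println("entityExpansionLimit = " + limit);
        System.out.println("application args: " + java.util.Arrays.toString(args));
    }
}
```

Running `java -DentityExpansionLimit=100000 ShowLimit` prints the property value, while `java ShowLimit -DentityExpansionLimit=100000` prints "(not set)" and shows the flag in the args list instead.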

I'm a Perl/MySQL type of person.. so this Java stuff is all very new to me (in fact, this is the first Java application I've ever tried using from the command line =))

Thanks again :)
 