java - Launch Solr Indexing in Nutch Source Code

June 15, 2014

I am trying to index a Nutch crawl into Solr from inside the source code, not from the command line.

I have created the following function:

public static int runInjectSolr(String[] args, Properties prop) throws Exception {
    String solrUrl = "http://ec2-x-x-x-x.compute-1.amazonaws.com/solr/collection1";

    String crawlDb = JobBase.getParam(args, "crawldb", null, true);
    String segments = JobBase.getParam(args, "segments", null, true);
    String args2[] = {crawlDb, segments};

    Configuration conf = new Configuration();
    conf.set("-D solr.server.url", solrUrl);
    int code = ToolRunner.run(NutchConfiguration.create(),
            new IndexingJob(conf), args2);
    return code;
}

But I am receiving the following error:

2013-08-07 19:37:13,338 ERROR org.apache.nutch.indexwriter.solr.SolrIndexWriter (main): Missing SOLR URL. Should be set via -D solr.server.url
SOLRIndexWriter
    solr.server.url : URL of the SOLR instance (mandatory)
    solr.commit.size : buffer size when sending to SOLR (default 1000)
    solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
    solr.auth : use authentication (default false)
    solr.auth.username : use authentication (default false)
    solr.auth : username for authentication
    solr.auth.password : password for authentication

So I am assuming that I am not creating the configuration correctly. Any suggestions?

Or should I be passing the configuration to run() in a different way? Maybe not using NutchConfiguration.create()?

There are two problems in your code:

First, the solr.server.url property must be set directly on the Configuration object, not as a -D option. The error message assumes that Nutch is being run from the command line, which is misleading here.
Second, as you mentioned, you are passing two different Configuration instances. NutchConfiguration.create() creates a Hadoop Configuration internally and adds the Nutch-specific resources, so you don't need to create one yourself. Also, ToolRunner passes the conf object to IndexingJob, so you don't need to pass it to the constructor.

So the correct code is:

Configuration conf = NutchConfiguration.create();
conf.set("solr.server.url", solrUrl);
ToolRunner.run(conf, new IndexingJob(), args2);
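
To make the two problems easier to see without a Nutch or Hadoop installation, here is a minimal sketch that uses java.util.Properties as a stand-in for Hadoop's Configuration. The names SolrConfSketch, FakeJob and runWith are hypothetical and exist only for this illustration, and the Solr URL is a placeholder. The only behaviour it mimics is that ToolRunner.run(conf, tool, args) calls setConf(conf) on the tool before running it, so the configuration handed to the runner is the only one the job sees.

```java
import java.util.Properties;

public class SolrConfSketch {
    static final String SOLR_URL_KEY = "solr.server.url";

    /** Hypothetical stand-in for a Hadoop Tool such as IndexingJob. */
    static class FakeJob {
        private Properties conf;

        FakeJob() {}

        FakeJob(Properties conf) { this.conf = conf; }

        void setConf(Properties conf) { this.conf = conf; }

        /** Returns 0 if the Solr URL is visible to the job, -1 otherwise. */
        int run() {
            return conf.getProperty(SOLR_URL_KEY) != null ? 0 : -1;
        }
    }

    /** Mimics what ToolRunner.run(conf, tool, args) does with its conf argument. */
    static int runWith(Properties runnerConf, FakeJob job) {
        job.setConf(runnerConf); // replaces any conf given to the constructor
        return job.run();
    }

    public static void main(String[] args) {
        String solrUrl = "http://localhost:8983/solr/collection1"; // placeholder URL

        // Problem 1: "-D solr.server.url" is a different key from "solr.server.url".
        Properties wrongKey = new Properties();
        wrongKey.setProperty("-D solr.server.url", solrUrl);
        System.out.println(runWith(wrongKey, new FakeJob()));

        // Problem 2: the right key, but set on a conf the runner never uses.
        Properties ignored = new Properties();
        ignored.setProperty(SOLR_URL_KEY, solrUrl);
        System.out.println(runWith(new Properties(), new FakeJob(ignored)));

        // Fix: one configuration, plain key, handed to the runner.
        Properties conf = new Properties();
        conf.setProperty(SOLR_URL_KEY, solrUrl);
        System.out.println(runWith(conf, new FakeJob()));
    }
}
```

The first two calls report failure (-1) and only the last one succeeds (0), which matches the corrected snippet above: create a single configuration, set the plain solr.server.url key on it, and pass that same object to the runner.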
