Java - HTTP Client help!

Hey guys,

I just finished my introductory course in Java and I'm working on expanding what I've learned a bit.

Anyway, I'm creating a program that needs to fill out a form on a website and search.

Here is a website example : http://olibweb263.lau.edu.lb:7782/eolib263?sf_entry=dkvg&session=42091200&rs=&style=tiau&infile=presearch.glue&searcher=tiau.glue&sf_subentry=&sf_entry2=&sf_entry3=&nh=20&beforedate=&afterdate=&x=31&y=20

I just want to fill out the first two boxes and search. I don't even want to handle the server's response for now; I just want it to search. After A LOT of researching, it seems that HttpClient is able to do this, but I've read a gazillion tutorials and examples and still can't figure out how.

I would post my code, but it seems totally moot since I've only managed to "copy-paste" the basic parts of HttpClient. I created the ArrayList needed and have no freaking idea what to do next!

Any help would be really appreciated :)
Did you try to print out the reply that you're getting back? It will be HTML, which you then have to parse to get the data you need, such as the book title.

The parameter that has to carry your query is sf_entry, for example sf_entry=programming. There are other parameters too, as you can see, such as beforedate and afterdate.

Look at the source of http://olibweb263.lau.edu.lb:7782/eolib263?sf_entry=programming&session=42091200&rs=&style=tiau&infile=presearch.glue&searcher=tiau.glue&sf_subentry=&sf_entry2=&sf_entry3=&nh=20&beforedate=&afterdate=&x=31&y=20

By looking at the DOM, you find that each book title is within a span tag like this:
<span class='resultsbright'><A rel='nofollow' HREF='/eolib263?session=42091200&infile=details.glu&loid=141139&rs=480850&hitno=1'>ASP 3.0 programmer's reference   (c2000)</A></span>
That's what you should be looking for after getting the reply from the server for that request. You have to search for every span node that has the attribute class = "resultsbright" and a child node <a>. Get the text inside that child and you will have your book title.
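To make that step concrete, here is a minimal sketch that pulls the titles out of a string of HTML. It assumes the results always look like the span shown above; the class name TitleExtractor and the method extractTitles are made up for this example. It uses a regular expression only because that needs nothing beyond the standard library; a real HTML parser would cope better with markup that varies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TitleExtractor {
    // Matches <span class='resultsbright'><A ...>TITLE</A>, with either quote style.
    // Assumption: the link is the first thing inside the span, as in the sample above.
    private static final Pattern TITLE = Pattern.compile(
        "<span\\s+class=['\"]resultsbright['\"]\\s*>\\s*<a\\b[^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the text of every link found inside a "resultsbright" span.
    static List<String> extractTitles(String html) {
        List<String> titles = new ArrayList<String>();
        Matcher m = TITLE.matcher(html);
        while (m.find()) {
            // Collapse runs of whitespace so the title reads cleanly.
            titles.add(m.group(1).replaceAll("\\s+", " ").trim());
        }
        return titles;
    }

    public static void main(String[] args) {
        String html = "<span class='resultsbright'><A rel='nofollow' "
            + "HREF='/eolib263?session=42091200&infile=details.glu&loid=141139&rs=480850&hitno=1'>"
            + "ASP 3.0 programmer's reference   (c2000)</A></span>";
        for (String title : extractTitles(html)) {
            System.out.println(title);
        }
    }
}
```

In the real program you would pass in the HTML string you got back from the server instead of the hard-coded sample.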

If you show us what you have so far, we can help with the code. Good luck.
Dude, I can't even figure out how to use HttpClient. I specifically noted that I just finished an INTRODUCTORY course for a reason :P This seems to be way out of my league, but I'm still trying. Here's what I managed to get:
  HttpClient httpClient = new DefaultHttpClient();

  // Give up if connecting or reading takes longer than 10 seconds
  HttpConnectionParams.setConnectionTimeout(httpClient.getParams(), 10000);
  HttpConnectionParams.setSoTimeout(httpClient.getParams(), 10000);

  // The URL must be a String literal, so it needs quotes
  HttpPost httpPost = new HttpPost("http://olibweb263.lau.edu.lb:7782/eolib263?sf_entry=programming&session=42091200&rs=&style=tiau&infile=presearch.glue&searcher=tiau.glue&sf_subentry=&sf_entry2=&sf_entry3=&nh=20&beforedate=&afterdate=&x=31&y=20");

  List<NameValuePair> nameValuePairs = new ArrayList<NameValuePair>();
  nameValuePairs.add(new BasicNameValuePair("name1", "value1"));
  nameValuePairs.add(new BasicNameValuePair("name2", "value2"));
  httpPost.setEntity(new UrlEncodedFormEntity(nameValuePairs));

  HttpResponse response = httpClient.execute(httpPost);
I can't even figure out how to use "name1" and "value1" or what to replace them with.

Sorry for being a total beginner, I guess I'm aiming too high, but I always do that and seem to succeed haha
Check out this sample code snippet that fetches the page, puts the HTML in a String object and prints it:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class TestHttp
{
  public static void main(String[] args)
  {
    try
    {
      // URL-encode the search term so spaces and special characters are safe in the query string
      String category = URLEncoder.encode("programming", "UTF-8");
      
      /* Your URL */
      String myUrl = "http://olibweb263.lau.edu.lb:7782/eolib263?sf_entry="+category+"&session=42091200&rs=&style=tiau&infile=presearch.glue&searcher=tiau.glue&sf_subentry=&sf_entry2=&sf_entry3=&nh=20&beforedate=&afterdate=&x=31&y=20"; 
  
      String results = getPage(myUrl);
      System.out.println(results);
    }
    catch (Exception e)
    {
      System.out.println(e.getMessage()); 
    }
  }
  
  private static String getPage(String urlString) throws Exception
  {
    URL url = null;
    BufferedReader reader = null;
    StringBuilder stringBuilder;

    try
    {
      // create the HttpURLConnection
      url = new URL(urlString);
      HttpURLConnection connection = (HttpURLConnection) url.openConnection();
      
      // just want to do an HTTP GET here
      connection.setRequestMethod("GET");
      
      // give it 15 seconds to respond
      connection.setReadTimeout(15*1000);
      connection.connect();

      // read the output from the server
      reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
      stringBuilder = new StringBuilder();

      String line = null;
      while ((line = reader.readLine()) != null)
      {
        stringBuilder.append(line + "\n");
      }
      return stringBuilder.toString();
    }
    catch (Exception e)
    {
      e.printStackTrace();
      throw e;
    }
    finally
    {
      if (reader != null)
      {
        try
        {
          reader.close();
        }
        catch (IOException ioe)
        {
          ioe.printStackTrace();
        }
      }
    }
  }
}
You have to concentrate on the getPage() function here: read the comments on the statements and try to understand what each one actually does. Notice the loop that reads from the stream line by line and fills a StringBuilder, whose contents are then returned as a String.

What you can do after getting that string of HTML is parse it, depending on what you want to find in there. I suggest you research DOM parsing in Java; check this thread too: http://stackoverflow.com/questions/1497946/how-can-i-parse-a-html-string-in-java . Good luck.
How is the code you gave me helpful before searching ?

From what I understood, I search then convert the resulting webpage's source into a string in java, and then parse that String to get the info I need. Correct ?

If yes, then how would I search in the first place? And I'd rather not use the URL to search, but actually input information into the form and submit it (I think that's what HttpClient does, no?)
And I'd rather not use the URL to search, but actually input information into the form and submit it (I think that's what HttpClient does, no?)
No, that's not exactly what an HTTP client does. An HTTP client sends an HTTP request (GET or POST) to a server, just like the form on a page does. Filling in the form on the page itself and submitting it from there would be complicated for nothing.

I suck at explaining, but here is a try. Let's say you have a form on a page index.php and another file on the server called search.php. Normally a user would open the index page, fill in the data and hit search; the form then sends a request (POST or GET) to search.php, and that file, once executed, returns the results.

Now let's say you want to do the same thing from your program, which you want to name "HttpClient". All you have to do is request search.php directly, sending exactly the same parameters that the form on index.php would have sent. The form on index.php is now mimicked by your program. Get it?
From what I understood, I search then convert the resulting webpage's source into a string in java, and then parse that String to get the info I need. Correct ?
Yes you can do that.
Now let's say you want to do the same thing from your program, which you want to name "HttpClient". All you have to do is request search.php directly, sending exactly the same parameters that the form on index.php would have sent. The form on index.php is now mimicked by your program. Get it?
Ok Ayman, I completely understand you, in theory. But how the hell would I mimic the index.php?
Man, the code I gave you is an example of how you can do an HTTP request on the page you want. This is the block you have to understand, and every statement is commented.
// create the HttpURLConnection
url = new URL(urlString);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();

// just want to do an HTTP GET here
connection.setRequestMethod("GET");

// give it 15 seconds to respond
connection.setReadTimeout(15*1000);
connection.connect();

// read the output from the server
reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
stringBuilder = new StringBuilder();

String line = null;
while ((line = reader.readLine()) != null)
{
  stringBuilder.append(line + "\n");
}
If you don't get what HttpURLConnection, BufferedReader or any other class does, please research it and try to understand it. What this code does is open a URL connection to the page and read what comes back from that connection via a BufferedReader, which reads the input stream coming from the connection. I can't simplify the procedure more than that.
Dude I understood all of that a long time ago, and I know what everything does in that code ! But what you provided is useful AFTER I search, what I want to do is to be able to search from the program! I think I got what you said about mimicking, I just need to figure out what to replace in "name1" and "value1" when using Http Client from the original code I provided.

I'll give it a try, thanks for the clarification.
I think you still don't get it, Mark.
HttpClient doesn't see a page. It doesn't fill in a form.
HttpClient sends an HTTP request to an HTTP server and receives the response, which you can usually treat as a string. It could be HTML, XML, JSON, plain text, an image, or some binary blob.
When you use a browser, fill in a form and press submit to search, this is what happens (an oversimplification):
- The GUI (the browser's graphical interface: buttons and textboxes) handles your input (mouse and keyboard).
- When you click search, the GUI collects this input, encodes it as a string (var1=here&var2=there) and passes it to an HttpClient (this is the oversimplification: browsers may do it differently, but the result is equivalent), which sends your request to the server.
- When the server responds, HttpClient receives that response, checks its type, and deals with it accordingly (download, page refresh, redirect...).

What you want to mimic is the HttpClient side of things. Don't expect HttpClient to fill in a graphical form for you, because it knows nothing at all about such insane things. It sends strings and receives strings. End of story.

If you want to use it, you'll have to create the request string that is passed to the HttpClient yourself (based on what the server expects, which you can find out by looking at the HTML form). When you receive the response, you can save it to a file and then view it in the browser, or you can parse it and do crazy Java stuff with it.
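As a rough sketch of "creating the request string yourself": the names are the form's field names and the values are whatever you would have typed into the boxes. This is where "name1" and "value1" come from. The class FormQuery and its methods are invented for this example, and the field names are simply copied from the URL earlier in the thread; which field belongs to which box on the page still has to be checked in the form's HTML.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormQuery {
    // Encodes one name or value the way form fields are encoded in a URL.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Joins the fields as name1=value1&name2=value2, in insertion order.
    static String buildQuery(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(field.getKey())).append('=').append(encode(field.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Field names taken from the URL posted in this thread (an assumption
        // about what the form sends; verify against the form's HTML).
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("sf_entry", "java programming");
        fields.put("session", "42091200");
        fields.put("style", "tiau");
        fields.put("infile", "presearch.glue");
        fields.put("searcher", "tiau.glue");
        fields.put("nh", "20");

        String url = "http://olibweb263.lau.edu.lb:7782/eolib263?" + buildQuery(fields);
        System.out.println(url);
    }
}
```

The resulting URL is what you would hand to getPage() from the earlier post. With Apache HttpClient, the same names and values would go into the BasicNameValuePair objects instead.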

A rookie mistake when someone is helping you is to say: I completely understand. What the hell are you asking for.
What do you mean by name1 and value1? Parameters? Just concatenate your parameters (the values you would type into the actual form) into the URL string, like it's done here with the sf_entry param.
String myUrl = "http://olibweb263.lau.edu.lb:7782/eolib263?sf_entry="+category+"&session=42091200&rs=&style=tiau&infile=presearch.glue&searcher=tiau.glue&sf_subentry=&sf_entry2=&sf_entry3=&nh=20&beforedate=&afterdate=&x=31&y=20";
Is that what you're asking for or is it something else?
@arithma: I did understand what you said when Ayman explained it, but you're talking theory. HOW do I know what the server expects, so I can create the request string that is passed to the HttpClient?

You said by looking into the html form, according to Ayman :
That's what you should be looking for after getting the reply from the server for that request. You have to search for every span node that has the attribute class = "resultsbright" and a child node <a>. Get the text inside that child and you will have your book title.
But for a beginner it's hard to convert plain English into code I've never seen before. I really did understand the concept behind it.

@Ayman, again, thanks a lot for your help.

By name1 and value1, I am talking about the arguments in "nameValuePairs.add(new BasicNameValuePair("name1", "value1"));" from the code I posted in post #3.
HOW do I know what the server expects, so I can create the request string that is passed to the HttpClient?
The server expects the parameters that are supplied in the URL, since it's a GET request.

This is the url for example:
http://olibweb263.lau.edu.lb:7782/eolib263?sf_entry=programming&session=42091200&rs=&style=tiau&infile=presearch.glue&searcher=tiau.glue&sf_subentry=&sf_entry2=&sf_entry3=&nh=20&beforedate=&afterdate=&x=31&y=20
So these are the parameters the server expects: sf_entry, session, rs, style, infile, searcher, etc. Writing sf_entry=someValue in the URL assigns someValue to the parameter sf_entry, which the server then reads.
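If it helps to see those names and values laid out, here is a small sketch that splits the part of a URL after the "?" into its parameters. The class QueryParams and the method parse are hypothetical names for this example. It is deliberately simple: it ignores "#" fragments and keeps only the last value when a name repeats.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryParams {
    // Reverses URL encoding, e.g. "java+programming" becomes "java programming".
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Splits everything after '?' into name/value pairs, in the order they appear.
    static Map<String, String> parse(String url) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        int questionMark = url.indexOf('?');
        if (questionMark < 0) {
            return params; // no query string at all
        }
        for (String pair : url.substring(questionMark + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int equals = pair.indexOf('=');
            String name = equals < 0 ? pair : pair.substring(0, equals);
            String value = equals < 0 ? "" : pair.substring(equals + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    public static void main(String[] args) {
        String url = "http://olibweb263.lau.edu.lb:7782/eolib263?sf_entry=programming"
            + "&session=42091200&rs=&style=tiau&infile=presearch.glue&searcher=tiau.glue"
            + "&sf_subentry=&sf_entry2=&sf_entry3=&nh=20&beforedate=&afterdate=&x=31&y=20";
        for (Map.Entry<String, String> p : parse(url).entrySet()) {
            System.out.println(p.getKey() + " = \"" + p.getValue() + "\"");
        }
    }
}
```

Running it on the URL from a search you did in the browser gives you the list of names to fill in from your program; parameters with empty values show up as empty strings.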
You said by looking into the html form, according to Ayman:
By looking at the form, you check its action attribute to see which page it submits to, and then send your HTTP requests to that file from your HTTP client. For example, if the form looks like this: <form method = "get" action = "post.php">, you know that you will be sending your HTTP request to the post.php file on the server, using GET.
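A rough sketch of doing that inspection from code rather than by eye: it reads the method and action of the first form in a page, plus the name of every input, select and textarea. FormInspector and its methods are hypothetical names, and the regular expressions assume fairly tidy HTML, so treat the output as a hint and double-check it against the page source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FormInspector {
    private static final Pattern FORM_TAG =
        Pattern.compile("<form\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    private static final Pattern FIELD_TAG =
        Pattern.compile("<(?:input|select|textarea)\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // Returns the value of one attribute from the inside of a tag, or null if absent.
    // Handles double-quoted, single-quoted and unquoted values.
    static String attribute(String tagBody, String name) {
        Pattern p = Pattern.compile(
            "\\b" + Pattern.quote(name) + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
            Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(tagBody);
        if (!m.find()) {
            return null;
        }
        for (int i = 1; i <= 3; i++) {
            if (m.group(i) != null) {
                return m.group(i);
            }
        }
        return null;
    }

    // Reads an attribute (such as "action" or "method") of the first <form> tag.
    static String formAttribute(String html, String name) {
        Matcher m = FORM_TAG.matcher(html);
        return m.find() ? attribute(m.group(1), name) : null;
    }

    // Collects the name attribute of every field that has one.
    static List<String> fieldNames(String html) {
        List<String> names = new ArrayList<String>();
        Matcher m = FIELD_TAG.matcher(html);
        while (m.find()) {
            String name = attribute(m.group(1), "name");
            if (name != null) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // A made-up form in the style of the example above.
        String html = "<form method = \"get\" action = \"post.php\">"
            + "<input type='text' name='sf_entry'>"
            + "<input type='submit' value='Search'></form>";
        System.out.println("method: " + formAttribute(html, "method"));
        System.out.println("action: " + formAttribute(html, "action"));
        System.out.println("fields: " + fieldNames(html));
    }
}
```

The action tells you which file to request, the method tells you whether to use GET or POST, and the field names are the parameter names to send.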