Anyone know about HTML Get requests?


Kaiyodo
10-12-2002, 01:53 AM
My Magelo converter is playing up 'cause Magelo is giving me my data in gzipped format rather than the uncompressed format it was giving me last week.

The request I send is ..

GET /eq_view_profile.jsp?num=123895 HTTP/1.1
Host: www.magelo.com

I've tried putting "Accept-Encoding:" in there too but it still delivers gzipped data.

Anyone know how I can get the raw uncompressed HTML data? Or decompress the gzipped stuff. Running 'uncompress' from zlib on the data just gives me a corrupted data error.

K.
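
One possible cause of the "corrupted data" error, offered as a guess rather than a diagnosis: zlib's uncompress() expects a zlib-wrapped stream, while a response sent with Content-Encoding: gzip carries a gzip header and trailer. In the C library the usual way around this is inflateInit2() with windowBits set to 16 + MAX_WBITS. The sketch below shows the same idea in Python for brevity; the gunzip helper and the sample data are made up for illustration.

```python
import gzip
import zlib


def gunzip(data: bytes) -> bytes:
    """Inflate a gzip-wrapped byte string (hypothetical helper)."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
    # instead of the zlib wrapper it looks for by default.
    return zlib.decompress(data, 16 + zlib.MAX_WBITS)


# Stand-in for a gzipped response body; not real Magelo data.
sample = gzip.compress(b"<html>hello</html>")

# The default call expects a zlib wrapper and rejects gzip input.
try:
    zlib.decompress(sample)
    plain_call_failed = False
except zlib.error:
    plain_call_failed = True

page = gunzip(sample)
```

If this still fails on real data, the bytes being passed in are probably not a clean gzip stream, for example because the HTTP headers are still attached.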

curtdept
10-14-2002, 04:44 PM
Web servers tend to gzip outgoing data these days to save on bandwidth, and browsers decompress it behind the scenes. Not sure, but I see you're using the .NET framework, so you may be able to parse it out using a common Internet Explorer control rather than a raw statement like GET. You could have IE retrieve the page, save it temporarily as HTML or another file format, then parse the data out of the temp file. Just a thought. You could also run the GET response through a gzip compression library. Both would take a little work; not sure which one would get you there faster :)

-Curtis

Kaiyodo
10-15-2002, 04:52 AM
Although I'm using Visual Studio .NET, I'm not actually using the .NET framework; it's Win32 API all the way. I've given up on trying to get uncompressed HTML from the server. My current idea is to try to instantiate an MSHTML control and get that to give me the source of the page.

I've tried sending the compressed data to the gzip library for decompression, but it just refuses to believe it's valid data. I even went to the trouble of downloading an open source web browser and putting in its decompression code; that still didn't work.

Thanks for the tips though. Someday I'll get it working again :)

K.
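
A decompressor rejecting the data outright can also mean the bytes handed to it are not the bare gzip stream. With raw HTTP/1.1 the response starts with a header block, and the body may additionally be wrapped in chunked transfer encoding, which interleaves chunk-size lines with the payload. Whether that is what happened here is only an assumption. A minimal sketch of unpacking such a response, using hypothetical helper names and a fabricated test response:

```python
import gzip
import zlib


def split_response(raw: bytes):
    """Split a raw HTTP response into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("ascii")] = value.strip().decode("latin-1")
    return lines[0].decode("latin-1"), headers, body


def dechunk(body: bytes) -> bytes:
    """Undo chunked transfer encoding (trailers are ignored)."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # The chunk size is hexadecimal; anything after ';' is an extension.
        size = int(body[pos:eol].split(b";")[0].decode("ascii"), 16)
        if size == 0:
            break
        start = eol + 2
        out += body[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF
    return bytes(out)


def decode_body(raw: bytes) -> bytes:
    """Return the response body with chunking and gzip removed."""
    _, headers, body = split_response(raw)
    if "chunked" in headers.get("transfer-encoding", "").lower():
        body = dechunk(body)
    if "gzip" in headers.get("content-encoding", "").lower():
        body = zlib.decompress(body, 16 + zlib.MAX_WBITS)
    return body


# Fabricated response for illustration: gzipped HTML sent in two chunks.
payload = gzip.compress(b"<html>profile</html>")
half = len(payload) // 2
chunked = (b"%x\r\n" % half + payload[:half] + b"\r\n"
           + b"%x\r\n" % (len(payload) - half) + payload[half:] + b"\r\n"
           + b"0\r\n\r\n")
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Encoding: gzip\r\n"
       b"Transfer-Encoding: chunked\r\n"
       b"\r\n" + chunked)
html = decode_body(raw)
```

The order matters: the chunk framing has to come off first, because the gzip stream is what sits inside the chunks.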

Kolo
10-23-2002, 06:25 PM
The GET request should be formatted correctly. I created an application at work that sends POSTs and GETs to the web server for user tracking. Here is the format I used:

GET /path/file.htm HTTP/1.1 <cr>
//(I believe the rest is optional but recommended, especially the Accept: ...not positive though)
User-agent: MyProgramName <cr>
Accept: */* <cr>
<cr>
Place two carriage return/line feed pairs at the end. The resulting blank line tells the server that the request is complete.

I used the User-Agent Field to send my application name so when we parse the logs, we can find out what percentage of total hits are from that app. You can also request the data as Text by putting:

ACCEPT: text/html

instead of the */*. This may be why you're getting the gzip data.

Can you PM me here in the forum a snippet of your code for accepting data? Maybe you are running into the same issue I had on variable declarations.
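
The request format described above can be sketched as a small builder. Two things here go beyond the post and should be read as assumptions: HTTP/1.1 requires a Host header, so one is included, and compression is negotiated through Accept-Encoding rather than Accept, so the sketch sends "Accept-Encoding: identity" to ask for an uncompressed body. A server is free to ignore that request. The function name and the default User-Agent value are only illustrative.

```python
def build_get_request(host: str, path: str,
                      user_agent: str = "MyProgramName") -> bytes:
    """Build a raw HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",        # request line on a single line
        f"Host: {host}",               # mandatory in HTTP/1.1
        f"User-Agent: {user_agent}",
        "Accept: text/html",           # media type only, not compression
        "Accept-Encoding: identity",   # ask for an uncompressed body
        "Connection: close",           # server closes when the body is done
    ]
    # Each line ends in CRLF; an extra CRLF gives the terminating blank line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request("www.magelo.com",
                            "/eq_view_profile.jsp?num=123895")
```

The resulting bytes can be written straight to a connected socket.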

Kaiyodo
10-26-2002, 05:45 AM
Just to let you know, I've got this working now. Swapped from raw HTTP GET requests to Windows' URLMoniker COM interface which does the decompression for me :)

K.