Hi, I'm having trouble using file() to fetch the text of web pages for an in-house URL directory and search engine I'm building. I'm using code along the lines of the following to take a user-submitted URL, fetch the text of the page at that address, and store it in a database for later searching. Most URLs work fine, but very long ones with long query strings cause file() to emit a warning. The URL used in the code below, for example, triggers the warning shown after the code, while almost any other URL works just fine. Any ideas?
Thanks,
Andy

//The code
$_POST['url'] = 'http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&threadm=v0401171cb70ca1615b5a%40%5B192.34.169.24%5D&rnum=1&prev=/groups%3Fq%3D%252Bphp%2B%252Breturn%2B%252Bkey%2B%252Bform%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26safe%3Doff%26selm%3Dv0401171cb70ca1615b5a%2540%255B192.34.169.24%255D%26rnum%3D1/';
$_POST['url'] = eregi_replace('http://','',$_POST['url']);
$_POST['url'] = eregi_replace('/$','',$_POST['url']);
$full_url = 'http://' . $_POST['url'] . '/';
if(!$body = file($full_url)){
    //echo an error message
} else {
    echo $body;
}

Causes:

Warning: file("http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&threadm=v0401171cb70ca1615b5a%40%5B192.34.169.24%5D&rnum=1&prev=/groups%3Fq%3D%252Bphp%2B%252Breturn%2B%252Bkey%2B%252Bform%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26safe%3Doff%26selm%3Dv0401171cb70ca1615b5a%2540%255B192.34.169.24%255D%26rnum%3D1/") - No error in c:\program files\nusphere\apache\htdocs\newslogic_dev\url_directory\test.php on line 11