PHP CURL Tutorial – Web Scraping & Login To Website Made Easy

In this video you will learn how to use PHP cURL. We are going to do three different examples: in the first one we will scrape the results from Google, in the second one from my personal website codingpassiveincome.com, and in the third example I'm going to show how to log into a website. So stay with me 🙂

Hey, what's up guys, Senaid here from codingpassiveincome.com, the place where I help others become web developers much more easily and faster than they would on their own. If that's something that interests you, consider subscribing.

For those of you that don't know what cURL is: we use it from PHP mainly when we want to communicate with other websites. If you want to get some information from a website, we usually use cURL because of its speed.

In the first example, as I mentioned in the introduction, we are going to scrape the results from Google. If we go to Google and search for something, let's say "test", you can see that the URL is always google dot, then the domain of the country you are coming from (mostly), then /search, and then they have appended a lot of different things. The one that we are interested in is this q, basically q= and then what you are searching for. So we can delete all the other things, and if we make a test with, let's say, "another test", you can see that it is still working. That is the URL that we want to scrape.

I will open the PHP tags, and the first thing we need to do is to initialize cURL. I'll create a variable $ch, short for cURL handler, and then I'll say curl_init(), and that's it. Now for this cURL handler we always need to add some options, depending on what you are going to do, and through these examples you will see what the different options are. Options are set with curl_setopt: the first parameter is the cURL handler, so just write your variable name here, and then comes the option that you want to set. This time we will set the URL option (CURLOPT_URL) and then specify the URL. So our URL is this one, and then, after the q=, we just need to specify the keyword that we are searching for, let's say Senaid Bacinovic. I usually use a plus between the words, just like URL encoding.

The next option that we need to set for this example is follow location (CURLOPT_FOLLOWLOCATION), which means that if Google redirects us from this URL to some other one, which they mainly do if you are using the wrong domain, then yes, we want to follow that location. We also want to get back what Google returns, so we set return transfer (CURLOPT_RETURNTRANSFER) to 1. Here I set 1, but you can use true instead of 1 and false instead of 0; it is the same thing, whatever you prefer. I think that's it, so let's make a test.
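
As a rough sketch, the setup described so far could look like the following. The google.com host and the buildSearchUrl helper are assumptions made for illustration (the video uses a country-specific Google domain), and urlencode stands in for typing the plus sign by hand.

```php
<?php
// Hypothetical helper (not from the video): builds the search URL for a keyword.
// urlencode() turns spaces into "+", which matches the manual "+" used in the video.
function buildSearchUrl(string $keyword): string
{
    // Assumption: google.com; the video uses a country-specific Google domain.
    return 'https://www.google.com/search?q=' . urlencode($keyword);
}

// Initialize cURL and keep the handle in $ch.
$ch = curl_init();

// The first argument is always the handle, then the option, then its value.
curl_setopt($ch, CURLOPT_URL, buildSearchUrl('Senaid Bacinovic'));
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, e.g. to another domain
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // make curl_exec() return the body instead of printing it
```

Nothing is sent at this point; the handle is only configured, and the request itself happens when it is executed.
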
Now, to make a test, we just need to declare a variable where we will store whatever Google returns. So we say curl_exec and pass it this cURL handler, and of course afterwards you close the handler with curl_close, which closes the connection. Then let's just echo the response and see what will happen. Okay, I'll hit refresh, and it works, here we go.
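
Put together, the execute, close and echo steps might look like this minimal sketch. The fetchPage wrapper and the error check are additions for illustration and are not shown in the video.

```php
<?php
// Hypothetical wrapper (not from the video) around the init/exec/close steps.
function fetchPage(string $url): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    // With CURLOPT_RETURNTRANSFER set, curl_exec() returns the body as a
    // string on success and false on failure.
    $response = curl_exec($ch);
    $error = curl_error($ch); // read the error message before closing the handle
    curl_close($ch);

    if ($response === false) {
        throw new RuntimeException('cURL request failed: ' . $error);
    }
    return $response;
}

// Usage (performs a real HTTP request, so it is left commented out):
// echo fetchPage('https://www.google.com/search?q=Senaid+Bacinovic');
```

Throwing on failure is a design choice of this sketch; without it, a failed request would simply echo an empty string.
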
As you can see, we now have all the results for this keyword. If we do the same search here in the browser and search for me, you can see Twitter, LinkedIn, my website, and so on. So this is how to get the page. But now let's see how we are going to parse, for example, all the titles from this so that we get only them, because we are scraping Google for a reason. There are different ways to scrape a website, I mean to analyze the returned HTML document and go through it, and for that I want to show you how to use Simple HTML DOM Parser. It's a very lightweight but very powerful library, so we will just download it.
The latest version, as you can see, is just a few kilobytes. It's very small but very powerful, and with this library you are basically able to search through the document just like you would in jQuery. So just extract it here and delete the archive. Along with the main file there are some extras included; I don't think that we need any of those, but we will see. At the top I will include simple_html_dom.php, and then of course we need to initialize it: I will create an $html variable, initialize simple_html_dom, and then load the response from Google. Let's see whether we get any error. I hit refresh, and it's empty.
Okay, I'm not sure why. Ah, sorry, it's okay: we didn't print out anything, so that's why it's empty. That means there isn't any error. Now let's try to search for the titles. Let's look at the source code, this is from Google, and try to find one of them. If we go to Google again, you can see that the first result is for Twitter, so let's find it: search for twitter.com, and here we go. You can see that we have a link, and in that link we also have the title that we are looking for. We can see that every such link has an href starting with /url? and then some URL. So we can write a foreach over $html->find with a selector like the one we would use in jQuery, saying that href starts with /url? (href^=/url?), and then as $link. We want to print out the link, and I will use innertext or maybe plaintext. You have different options here for extracting, and you can find them in the online documentation for Simple HTML DOM, where of course you have explanations of the different things and how you can use them.
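
For readers who want to try the idea without downloading the library, here is a minimal sketch of the same "href starts with" selection. It swaps in PHP's built-in DOMDocument and DOMXPath rather than Simple HTML DOM, so it is not the code from the video. The linkTexts helper and the sample markup in the usage comment are hypothetical. The optional third argument skips links whose href contains a given substring, using the same strpos check the video applies a little later.

```php
<?php
// Hypothetical helper (not from the video). Returns the text of every <a>
// whose href starts with $hrefPrefix, skipping hrefs that contain any of the
// given substrings. Assumes $hrefPrefix contains no double quote.
function linkTexts(string $html, string $hrefPrefix, array $skipHrefsContaining = []): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = sprintf('//a[starts-with(@href, "%s")]', $hrefPrefix);

    $texts = [];
    foreach ($xpath->query($query) as $a) {
        $href = $a->getAttribute('href');
        foreach ($skipHrefsContaining as $needle) {
            // strpos() returns false only when the needle is absent.
            if (strpos($href, $needle) !== false) {
                continue 2; // unwanted link, move on to the next <a>
            }
        }
        $texts[] = trim($a->textContent);
    }
    return $texts;
}

// Usage with made-up markup:
// $html = '<a href="/url?q=https://twitter.com/x">Twitter</a>';
// foreach (linkTexts($html, '/url?') as $title) {
//     echo $title, "\n";
// }
```

textContent plays roughly the role that plaintext plays in Simple HTML DOM: it returns the link text with the inner tags stripped.
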
Okay, so just read the documentation if you want to learn more. Let's make a test for our case, and here we go. Maybe let's just add a new line after each one. And here they are, link by link: LinkedIn, Twitter, "programmer and blogger Senaid", and as you can see we also have something weird here. I'm not sure what this one is; it is in Bosnian. Let's try to find it in the source. It's here, and we can see that its URL points to webcache.googleusercontent.com, and we don't want to include that. So we can add an if with strpos: in our link's href, check whether we find webcache.google and so on, and if that is equal to false, meaning we did not find it, just print out the text. Let's see, and it's gone. So now we have only the results about me in this search, which is what we were looking for. If you know how to use this Simple HTML DOM parser, you will know how to extract anything from that page. That is the first example, so let's go to another one.

In the next one we are going to try to get information from my personal website. This is my website, and here are all the tutorials that I have.
There's a lot of them, so let's try to fetch the page and see what will happen. I will just paste the link here, we will use cURL as before, and we will just echo the response and exit right after, just to test this part. And now, as you can see, there isn't any tutorial; this is just something about my website. The problem is that, if you open the developer tools with F12 and hit refresh, you can see that I'm loading all the tutorials dynamically. If we go down and click on "load more", you can see that I'm making an Ajax call, and with that Ajax call I'm loading more tutorials. If we do it again, the same thing happens. That means we need to execute a POST request to this URL in order to get more tutorials, and for that we need to modify the script a little bit. So I'll change the URL to the Ajax PHP script. In the form data you can see that I'm sending four different things, so we will need to send them too. I will create a variable for the post fields and create an array. This is an associative array, which means we need to write out each key and value; that can be boring sometimes, but okay. As you can see, I have a start variable and a limit, so let's say I am starting from zero and want to get 10 tutorials. Now we need to tell cURL that we are going to make a POST request. For that we say curl_setopt with the cURL handler and the post fields option (CURLOPT_POSTFIELDS), and the value will be our post fields, but I will pass them through http_build_query. Then I will quickly duplicate this line, change it, and say that we are making a POST (CURLOPT_POST), setting it to 1. Let's make a test now: refresh, and as you can see we are now getting only the posts from my website.
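
A minimal sketch of the POST request described above. The endpoint URL in the usage comment is a placeholder, and only the start and limit fields are included because those are the only two of the four form fields the video names. The endpoint answers with JSON, and the video reads a field called recent from it in the next step, so a small decoding helper is included as well. Both helper names are hypothetical.

```php
<?php
// Hypothetical helper: sends $fields as a form-encoded POST and returns the body.
function postForm(string $url, array $fields): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true); // make this a POST request
    // http_build_query() turns the array into "key=value&key2=value2".
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));

    $response = curl_exec($ch);
    $error = curl_error($ch);
    curl_close($ch);

    if ($response === false) {
        throw new RuntimeException('POST request failed: ' . $error);
    }
    return $response;
}

// Hypothetical helper: decodes the JSON body and returns the HTML stored under
// "recent", or null when the body is not the expected JSON.
function recentHtmlFromJson(string $json): ?string
{
    $data = json_decode($json);
    if (!is_object($data) || !isset($data->recent) || !is_string($data->recent)) {
        return null;
    }
    return $data->recent;
}

// Start at offset 0 and ask for 10 tutorials; the other two form fields from
// the video are not named in it, so they are left out here.
$postFields = ['start' => 0, 'limit' => 10];

// Usage (real request; the URL is a placeholder):
// $html = recentHtmlFromJson(postForm('https://example.com/ajax.php', $postFields));
```

Checking the decoded value before reading recent avoids the "Trying to get property of non-object" notice that appears when the response is not valid JSON.
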
If we view the source with Ctrl+U, there's a lot of HTML, so maybe let's echo a <pre> first. The problem is that this is still not nicely laid out, because as you can see what we have here is JSON. So I will json_decode the response and then use $response->recent, which is this part here. Let's refresh now, and here we go. You can see that the response has different parts, and in each part we get different tutorials. We can also see that for every tutorial I have an h2 with a title class, inside that title there is a link, and in that link we have the title of the tutorial. So again we are going to use Simple HTML DOM: here I'll load $response->recent, then we will search for the h2, copying its class from the source, then we will remove this part and echo the link's plaintext. And here we go: as you can see, even from my own website I have now got all the tutorials. The difference is that we needed to add POST and POSTFIELDS to this cURL call, or it wouldn't work.

Now, the last thing that I want to do in this tutorial is to show you how to log in to a website. We will again use my website as an example. On my website I have a special members dashboard.
It is for the people that have bought a course from me, and this is the login page that gives access to those courses. As you can see, in this drop-down you are able to select one of the courses; let's say we are going to use this one. Then we need to provide the email and the password. I have set a temporary password, so I press Enter, and if we log in it shows a message and logs us in. It's obvious that this is done over jQuery Ajax, because the page did not refresh and yet we have logged in.
For that reason we need to know what information is being sent, so that we can send the same. Of course, if you know that it is an Ajax call, just go to the source code and see what is happening there. As you can see, the code is pretty simple: in the Ajax call we are sending four pieces of information, which are check login as a key, the email, the password, and the course ID. The course ID we get from this select, from the selected option. So let's do it. I will just copy this, or maybe we can just quickly write them: email, password, and then course ID. The email is mine, the password is the one that we entered, and the course ID for this course is number one. That's it, so I will save, and then we need to set our URL. The URL where we send the information is login.php, so I will just copy it. Let's again see what will happen. Here it says that an object of class stdClass could not be converted to string. That's okay, let's remove this for the moment, and here it goes: as you can see, we have got a JSON response where status is 1 and the first lesson is number one, which means that we have successfully logged in.

But the thing is that we want to see the members dashboard and not only this message. That means we need to store the cookie information somewhere, and for that there is another cURL option: curl_setopt with the cURL handler and the cookie jar option (CURLOPT_COOKIEJAR), where we need to set the path to a file. In my case I'll just say cookie.txt. Now I will json_decode the response and check it: if the response status is set, that means we are logged in, and we are going to make another cURL call to get the dashboard page. Let's see where it redirects us in the browser. So if we are logged in, we make a test and try to get this page. Here I will remove the curl_close, because I don't want to close this cURL handler yet; that's very important. Then I will paste everything again here. POST and POSTFIELDS we don't need this time, we just need this URL.
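
The login flow above could be sketched roughly as follows. The URLs and the login field names in the usage comment are placeholders, since the video does not spell them out. The status and firstLesson keys follow the JSON response shown in the video. Setting CURLOPT_COOKIEFILE alongside CURLOPT_COOKIEJAR and switching back to GET with CURLOPT_HTTPGET are additions of this sketch, not steps shown in the video.

```php
<?php
// Hypothetical helper: returns the first lesson id from the login JSON,
// or null when the response does not indicate a successful login.
function firstLessonFromLogin(string $json): ?int
{
    $data = json_decode($json);
    if (!is_object($data) || empty($data->status) || !isset($data->firstLesson)) {
        return null;
    }
    return (int) $data->firstLesson;
}

// Hypothetical helper: logs in, then fetches a members-only page on the SAME
// handle so the session cookie from the login is sent along.
function loginAndFetchLesson(string $loginUrl, array $loginFields, string $lessonUrlPrefix, string $cookieFile): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // cookies are written here when the handle closes
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // addition: also read cookies from the same file

    // Request 1: POST the login form.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($loginFields));
    $loginResponse = curl_exec($ch);

    $lessonId = is_string($loginResponse) ? firstLessonFromLogin($loginResponse) : null;
    if ($lessonId === null) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('Login failed' . ($error !== '' ? ': ' . $error : ''));
    }

    // Request 2: the handle is still open; switch back to GET and change the URL.
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    curl_setopt($ch, CURLOPT_URL, $lessonUrlPrefix . $lessonId);
    $page = curl_exec($ch);
    $error = curl_error($ch);
    curl_close($ch);

    if ($page === false) {
        throw new RuntimeException('Fetching the lesson page failed: ' . $error);
    }
    return $page;
}

// Usage (all values are placeholders):
// echo loginAndFetchLesson(
//     'https://example.com/login.php',
//     ['email' => 'me@example.com', 'password' => 'secret', 'courseID' => 1],
//     'https://example.com/lesson.php?id=',
//     __DIR__ . '/cookie.txt'
// );
```

Keeping one handle open for both requests is what carries the login session over. Closing it after the login and starting a fresh handle would work only if the new handle reads the saved cookie file.
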
The ID here is basically the first lesson from the login response. Of course we don't need json_decode here, because now we are getting back the entire HTML. Let's now echo everything from this response to see whether we logged in successfully. I'll refresh, and here we go: as you can see, on the left side we now have the menu that you should have, like this, and here we have some content for the first lesson, whatever it might be. And if we view the source, here is the source code that we are getting. Which is amazing, we did this part correctly.

The next thing that I want to do is to actually get the menu items from this dashboard, so let's see how we can get those. As you can see, each item has a link whose href starts with the view lesson .php page, so in the selector we can say that href starts with that view lesson .php page and its id parameter. Let's make a test. We get an error about an object; what is the problem here? We should not load recent here, we just need the response itself. And as you can see, now we have all the links from our page.

So guys, that's pretty much it. I hope that you have enjoyed this video, and if you did, please like it and share it with your friends. Of course, if you have any questions, doubts, or anything else, please ask in the comments below. Take care 🙂

88 thoughts on “PHP CURL Tutorial – Web Scraping & Login To Website Made Easy”

  1. That’s really great 👍🏼. BTW, do you work as freelancer?! If yes, I have paid tasks for you to do. Plz contact me if you are interested. Thank you.

  2. Hey
    I Tried php curl for a website form but it show acces denied
    can you code for me
    URL: http://peb.mp.gov.in/results/RESULT_17/PRT_RES17/FINAL_RESULT/default_results.htm
    for testing input detail:
    Search Roll No. (Max.9 digits): 752563302
    Date of Birth (DD/MM/YYYY): 12/10/1988

    Please help me
    code for it and send to me at [email protected]

  3. Your registration page has some error . I am not able to register !!

    https://codingpassiveincome.com/students/register.php

  4. Hi, can you explain how can I get result for recent FB sponsored ads by creating a search query?.. What is the logic behind that kind of search query, If it is through graph-api then kindly send me the link to the documentation..example of the search is: adespresso[dot]com

  5. thank you bro ,please can'i get data from my Bing ads acount using this ,because i tried but it didn't worked form me ,can you help me please

  6. Thank you sir, i re ally appreciate your videos. Please can you make a proof of payment tutorial using php. Thanks Sir.

  7. Hi, I subscribe, to you cannel, and can you say me how can I send an whatsapp mensage from my website, using the curl, I want to enter the pc whatsapp site, log with my phone, and send the mensage for one of my friends, using my website, can you help me? thanks.

  8. Getting Error when login with google

    Fatal error: Uncaught Google_Service_Exception: { "error": { "errors": [ { "domain": "global", "reason": "authError", "message": "Invalid Credentials", "locationType": "header", "location": "Authorization" } ], "code": 401, "message": "Invalid Credentials" } } in /storage/ssd3/927/5020927/public_html/GoogleAPI/src/Google/Http/REST.php:118 Stack trace: #0 /storage/ssd3/927/5020927/public_html/GoogleAPI/src/Google/Http/REST.php(94): Google_Http_REST::decodeHttpResponse(Object(GuzzleHttp\Psr7\Response), Object(GuzzleHttp\Psr7\Request), 'Google_Service_…') #1 /storage/ssd3/927/5020927/public_html/GoogleAPI/src/Google/Task/Runner.php(176): Google_Http_REST::doExecute(Object(GuzzleHttp\Client), Object(GuzzleHttp\Psr7\Request), 'Google_Service_…') #2 /storage/ssd3/927/5020927/public_html/GoogleAPI/src/Google/Http/REST.php(58): Google_Task_Runner->run() #3 /storage/ssd3/927/5020927/public_html/GoogleAPI/src/Google/Client.php(788): Google_Http_REST::execute(Object(GuzzleHttp\Client), Object in /storage/ssd3/927/5020927/public_html/GoogleAPI/src/Google/Http/REST.php on line 118

  9. I was looking for some CURL tutorials, and luckily found your channel! So quick and informative, subscribed immediately! Keep up the good work, and looking forward to new videos! Hvala! 😀

  10. Hi. thanks for share. I have a problem … when i try to login . I recibe this message. Unauthorized Request Blocked

  11. So you send by COOKIEJAR some cookies values that are created after each login session? What was in txt?

  12. Hi Senaid, I'm just still getting a login page and not getting logged in. Can I mail you the code instead of showing it here (because the login info is in it)?

  13. Google will prevent you from curl-scrapping and web crawling and attempt to block you using your internet IP Address if you are abusing request over its search service , how to resolve this ? Simply Delete cookies used by Google or change IP Address … N.B : Using Private or Anonymous IP Address does not resolve the situation , so don't thin about it

  14. can you help me please, i love your videos on yt.
    my web example http://vidplayer.online/hi.php
    hi am using this code
    <?php
    /* ——————————SCRAPING DATA USING THE CURL FUNCTION IN PHP —————————————————*/
    //initialize curl
    $url = curl_init('http://123netflix.biz/movies-list.html');
    curl_setopt($url, CURLOPT_RETURNTRANSFER, TRUE);
    $page = curl_exec($url);
    // check for execution errors
    if(curl_errno($url))
    { echo 'Scraper error: ' . curl_error($url);
    exit;
    }
    //Use Regex to pull only some data from page
    $regex = '/<div class="row top">(.*?)<div class="pagination-wrap">/s';
    if ( preg_match($regex, $page, $list) )
    echo "<klk>" . $list[0] . "</klk>";
    else
    echo "Not found";
    curl_close($url);
    ?>

    how can i get the links without, Im Getting this <a href="https://gomovieshd.se/movie/hearts-beat-loud-2018-2/" class="ml-mask jt" data-url="ajax/movie_details/13371.html" data-hasqtip="240" oldtitle="Hearts Beat Loud" title="Hearts Beat Loud" aria-describedby="qtip-240">

    but i wanna get the link without the https://gomovieshd.se, like this href="/movie/hearts-beat-loud-2018-2/" class="ml-mask jt" data-url="ajax/movie_details/13371.html" data-hasqtip="240" oldtitle="Hearts Beat Loud" title="Hearts Beat Loud" aria-describedby="qtip-240">

  15. Excellent tutorial. I follow all your tutorials 🙂
    Can you please create some tutorials on Google Maps, Places, and Geocoding APIs with PHP/JS ?

  16. nice tutorial thanks for sharing. FYI – my linux curl didn't need to define a "value" parameter. It would give me an error. This worked ==> curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1).

  17. hi, i have a question, what if i want to scrape the url with "containing" function?

    for example the website is blablabla.com, almost every links start with blablabla.com
    and the links i want to scrape contain the word "trkid", how can i do that? what should i add after the "href"?

  18. I am planning to use your course TODAY in an exercise to include dedicated proxies and anti-captcha. Do you use a specific cloud host? I am using Amazon EC2 and Workspaces, do you have any recommendations on a cloud host for running data extraction scripts?

  19. Sir I need help while Scraping some website block me from scraping but i need to scrap those website can you please do me a favor and help please reply me asap Waiting Thanks Regards Bilal Khan

  20. I was with you up until you started using jQuery. Why not just use plain VanillaJS? I hate jQuery, as do countless employers now.

  21. hi,your video is very good. I have a suggestion for you: can you set in description time codes(if you tell about 2+ themes)

  22. Would you please make a tutorial for paypal checkout like stripe you made. Thanks for making that video

  23. Hi, I am japanese. So, I am not good at English sorry.

    By the way, I have question.

    It appears in about 19minutes and 4seconds of this tutorial, You sucess login and It appears in about 21minutes and 37seconds, You get view-sourece {"status":1, "firstLesson"=1}.

    But, I can't log in even if I do the same, and I get {"status":0} instead of {"status":1, "firstLesson"=1}.

    Is this because I'm doing it on localhost or is there any other cause?

    In this case, the tutorial will not continue, so please let us know if there is a solution.

    Finally、 I was sorry in ugly English.

    And,I am very grateful that your tutorial has been easy to understand so far.

    Thank,You.

  24. thank you for this great tutorial, i followed your code and get this error; Parse error: syntax error, unexpected ':', expecting ')' in C:\xampp\htdocs\robot\index.php on line 4 can you please help me to fix this error, or any one here had the same problem, thank you very much

  25. Hallow, thank you for this great tutorial, i try to do so with a link but it give me nothing; this is the link curl_setopt($ch, CURLOPT_URL, "http://soluciapj-vad.com/#/connexion"); i think the probleme is the char#, can you please try to see what happen with this link, thank you

  26. Notice: Trying to get property 'recent' of non-object

    in :
    $html->load($response->recent);
    could you explain this error please … cuz I couldn't find any solution for it…
