Hello! 👋

  • 0 Posts
  • 4 Comments
Joined 2 years ago
Cake day: June 11th, 2023


  • Every time I read news about Trump I wonder what his end goal is. He sounds so much like those end users who say unreasonable things like “you need to add/fix/change this”. If it gets done, they always come back for more, because they didn’t get the outcome they wanted. Then it turns out they wanted something that was reasonable (or not), but the changes they demand won’t get them there.

    Is his end goal really getting more jobs?

    And I wonder what kind of jobs he wishes to create, because a lot of Americans already seem to have many weird jobs, like just standing in a corner pointing in a direction, or taking your filled-in form and walking 3 meters to hand it to another person.



  • This kinda reminds me of pirating vs paying. Using an API = you know it will always have the same structure and you will get the data you asked for. If it does change, you will be notified, or they version their API. There is usually good documentation. You can always ask for help.

    Scraping = you need to scout the whole website yourself. You need to keep up to date with the website’s structure and make sure they haven’t added ways to block bots (scraping). Error handling is a lot more intense on your end: missing content, hidden content, having to query for data. The website may not follow the same standards/structure throughout, so you need checks for when to use x to get y. The data may need multiple requests, because for example they do not show all the user settings on one page, whereas an API call would return them together. Or it is an AJAX page and you need to run JavaScript and click on buttons that may change id, class or text, and the page may load data when you do x with JavaScript, so you need to emulate the webpage.
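To make the "checks for when to use x to get y" part concrete, here is a rough sketch using only Python's standard library. Everything specific in it is made up for illustration: the `user-name` class, the `profile-name` id, and the helper names `TextByAttr`, `extract` and `scrape_username` are assumptions, not anything from a real site. It tries one known layout after another and fails loudly when none of them match.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TextByAttr(HTMLParser):
    """Collect the text of the first element whose attribute contains a value."""

    def __init__(self, attr, value):
        super().__init__()
        self.attr, self.value = attr, value
        self.depth = 0      # > 0 while we are inside the matching element
        self.done = False   # True once the first match has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self.value in (dict(attrs).get(self.attr) or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract(html, attr, value):
    parser = TextByAttr(attr, value)
    parser.feed(html)
    return "".join(parser.chunks).strip() or None


# Hypothetical layouts: different parts of the site mark up the same data differently.
FALLBACKS = [("class", "user-name"), ("id", "profile-name")]


def scrape_username(html):
    for attr, value in FALLBACKS:
        text = extract(html, attr, value)
        if text:
            return text
    raise LookupError("no known layout matched; the page structure may have changed")
```

With an API this would be one key lookup in a JSON response. Here every layout the site uses needs its own entry in `FALLBACKS`, and the `LookupError` is the only warning you get when the site changes under you.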

    So my guess is that scraping is used most often when you only need to fetch simple data structures and you are fine with cleaning up the data afterwards. Like grabbing all the text/images on a page, checking if a page has been updated, or just saving the whole page like the Wayback Machine.
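
The "checking if a page has been updated" case is about as simple as scraping gets. A minimal sketch, assuming you already have the raw page bytes and stored a fingerprint from the previous visit (the function names `fingerprint` and `has_changed` are just illustrative):

```python
import hashlib


def fingerprint(page_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of the raw page content."""
    return hashlib.sha256(page_bytes).hexdigest()


def has_changed(page_bytes: bytes, previous_fingerprint: str) -> bool:
    """True if the page content differs from what was seen last time."""
    return fingerprint(page_bytes) != previous_fingerprint
```

The bytes could come from something like `urllib.request.urlopen(url).read()`. One caveat: pages that embed timestamps, ads or session tokens will look "changed" on every fetch, so in practice you would probably hash only the part of the page you care about.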