An API is an official interface for connecting to a service, usually designed to make it easier for one application to interact with another. It is usually kept stable and returns only the information needed to serve the requesting application.
A scraper is an application that extracts data from a human-readable source (e.g. a website) to obtain data from another application. Since website designs can change frequently, scrapers can break at any time and need to be updated whenever the original application changes.
Reddit clients interact with an API to serve requests, but NewPipe scrapes the YouTube webpage itself. So if YouTube changes their UI tomorrow, NewPipe could very easily break. No one wants to design their app around a fragile base while building a bunch of stuff on top of it. It’s just way too much work for very little payoff.
It’s like I can enter my house through the door or the chimney. I would always take the door since it’s designed for human entry. I could technically use the chimney if there’s no door. But if someone lights up the fireplace I’d be toast.
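To make the door-versus-chimney difference a bit more concrete, here’s a rough Python sketch. Everything in it is made up for illustration: the JSON payload, the `video-title` class name, and both versions of the page markup are assumptions, not what YouTube or Reddit actually serve. The point is only that the API reader keeps working while the scraper silently stops finding anything once the markup changes.

```python
import json
from html.parser import HTMLParser

# Hypothetical API response: a documented, stable JSON shape.
API_RESPONSE = '{"title": "My video", "views": 1234}'

# Hypothetical page markup before and after a site redesign.
PAGE_V1 = '<div class="video-title">My video</div>'
PAGE_V2 = '<h1 class="headline">My video</h1>'


def title_from_api(body):
    """Read the title from the structured API payload."""
    return json.loads(body)["title"]


class TitleScraper(HTMLParser):
    """Collects the text inside a tag whose class is 'video-title'."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # The scraper depends on a class name the site never promised to keep.
        if dict(attrs).get("class") == "video-title":
            self._inside = True

    def handle_endtag(self, tag):
        self._inside = False

    def handle_data(self, data):
        if self._inside and self.title is None:
            self.title = data.strip()


def title_from_page(html):
    """Scrape the title out of the page markup, or return None if not found."""
    scraper = TitleScraper()
    scraper.feed(html)
    return scraper.title


if __name__ == "__main__":
    print(title_from_api(API_RESPONSE))  # the API path works
    print(title_from_page(PAGE_V1))      # the scraper works on the old layout
    print(title_from_page(PAGE_V2))      # the scraper finds nothing after the redesign
```

A real scraper has the same problem at a much larger scale, since it depends on many such details at once.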
Yep. One thing to add: the only real way for YouTube to block this would be to require a login before showing the webpage to any user.
Nothing but effort. Nobody wants to constantly baby a project just because someone else may change their code at a moment’s notice. Why would you want to comb through someone else’s HTML and obfuscated JavaScript to figure out how to grab some dynamically shown data when there was a well-documented, publicly available API?
Also NewPipe breaks all the time. APIs are generally stable, and can last years if not decades without changing at all. Meanwhile NewPipe parsing breaks every few weeks to months, requiring programmer intervention. Just check the project issue tracker and you’ll see it’s constantly being fixed to match YouTube changes.
You mean a Reddit scraper? The other day I saw someone present their project RDX, which if I understand you correctly is just what you’re asking for.
Well, from a personal perspective as a developer, maintaining an app that’s scraping another website for data doesn’t sound like a lot of fun. It’s not exactly enticing to put a lot of effort into something that could be broken on a whim.
An API is like an easy gateway that sites offer so other developers can integrate the site’s data into their own apps. Reddit used to be more permissive towards third-party apps back when they didn’t have an official one, but now that their data is used for other purposes and they have an official app, they don’t want to give out their data for free anymore.
All fediverse sites use an API called “ActivityPub” to have their content work with each other, so a Lemmy account on lemmy.ml can follow a user on lemmy.world, and their posts can be seen in a third-party app.
NewPipe uses data scraping to get data from YouTube out of the pages the site serves. This is not using an API; instead it acts like a user and gets the data from the site itself. Sites can introduce various methods to disrupt data scraping, but they cannot truly stop it without crippling the anonymous user experience. As you might have noticed, lots of modern media sites have started to do exactly that.
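As a rough illustration of “acts like a user”: a scraper usually asks for the same public page a browser would, without any API key or login. The sketch below only builds such a request with Python’s standard `urllib.request` and does not send it. The URL and the User-Agent string are placeholders chosen for the example, and this is not a description of how NewPipe itself is implemented.

```python
from urllib.request import Request

# Assumed browser-like User-Agent string, purely illustrative.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_page_request(url):
    """Build (but do not send) an anonymous GET for a public page."""
    return Request(url, headers={"User-Agent": BROWSER_UA, "Accept": "text/html"})


if __name__ == "__main__":
    # Placeholder URL; no network traffic happens here.
    req = build_page_request("https://example.com/watch?v=abc")
    print(req.get_method(), req.full_url)
    print(req.header_items())
```

Because the request looks like ordinary anonymous browsing, the only reliable way for a site to refuse it is to refuse anonymous visitors as well.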
Just to be super pedantic, ActivityPub is the protocol used to federate. Mastodon, Lemmy, and KBin all have different APIs, but they all use the ActivityPub protocol.
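For a sense of what the protocol side looks like, here’s a minimal sketch of an ActivityPub-style `Follow` activity built in Python. The `@context`, `type`, `actor`, and `object` fields come from the ActivityStreams vocabulary that ActivityPub builds on; the two account URLs are hypothetical, and a real server would also add an `id`, sign the request, and deliver it to the target’s inbox, none of which is shown here.

```python
import json

# ActivityStreams context URL used by ActivityPub messages.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_follow(actor, target):
    """Return a bare-bones Follow activity as a plain dict."""
    return {
        "@context": AS_CONTEXT,
        "type": "Follow",
        "actor": actor,    # who is following
        "object": target,  # who is being followed
    }


if __name__ == "__main__":
    # Hypothetical accounts on two different instances.
    activity = build_follow(
        "https://lemmy.ml/u/alice",
        "https://lemmy.world/u/bob",
    )
    print(json.dumps(activity, indent=2))
```

Each platform’s own client API (the one third-party apps talk to) is a separate thing layered next to this.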
/iHadAStroke
Wtf is that title?