For some reason Twitter’s frontend uses a hard-coded bearer token, at least for ...

1vuio0pswjnm7 · on April 13, 2023

FWIW, I have never logged in to Twitter and I have always been able to retrieve all tweets. At first, I used mobile.twitter.com in a text-only browser, no token required. Since they started using GraphQL, I retrieve tweets as JSON. They have changed the token once. The current one is

Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA

IME, the old token will not work.

YouTube does the same thing. I never run JavaScript from YouTube. I do not use youtube-dl or its JS interpreter written in Python. I search YouTube and retrieve YouTube JSON from the command line.

It's funny how people commenting on HN often automatically assume the presence of a token is some sort of "security".

For YouTube search and browse I use "WEB" key AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8

For YouTube player I use "ANDROID" key AIzaSyA8eiZmM1FaDVjRy-df2KTyQ_vz_yYM39w

It's like how web pages used to (and probably still do) use "type=hidden" in HTML forms to submit some value that the user does not enter. Hidden does not mean "secret"; it just means not visible on the rendered page.
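To make the point concrete, here is a minimal sketch that pulls hidden fields out of a page with grep. The form and its field names are invented for illustration, `list_hidden` is a hypothetical helper name, and the pattern assumes lowercase, double-quoted attributes; it is not a real HTML parser.

```shell
#!/bin/sh
# list_hidden: read HTML on stdin and print every <input> tag
# whose type attribute is "hidden".
list_hidden() {
    grep -o '<input[^>]*type="hidden"[^>]*>'
}

# Invented form, standing in for any real page source.
cat <<'EOF' | list_hidden
<form action="/search" method="post">
<input type="text" name="q">
<input type="hidden" name="csrf" value="abc123">
</form>
EOF
```

Anything the browser submits this way is readable by whoever fetches the page.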

There's an obvious expectation that some users look at HTTP response headers and HTML when there are headers like "If you're reading this, we're hiring" and silly ASCII art in the HTML that's obviously meant for an external audience. YouTube even has some nonsensical line about a "robot uprising in the year 2000" in its robots.txt.

1vuio0pswjnm7 · on April 14, 2023

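Here's an example of a site whose JSON API needs no login: the API key is a public value in the URL, much like the tokens above. (It is Algolia's REST search API, not GraphQL.) A simple HN search script to fetch Algolia JSON. No need to be logged in to HN.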

   #!/bin/sh
   test $# -gt 0||exec echo usage: "$0" query
   # "$*" joins all arguments into one query string without word splitting or globbing
   DATA='{"query":"'"$*"'","analyticsTags":["web"],"page":0,"hitsPerPage":30,"minWordSizefor1Typo":4,"minWordSizefor2Typos":8,"advancedSyntax":true,"ignorePlurals":false,"clickAnalytics":true,"minProximity":7,"numericFilters":[],"tagFilters":["story",[]],"typoTolerance":"min","queryType":"prefixNone","restrictSearchableAttributes":["title","comment_text","url","story_text","author"],"getRankingInfo":true}'
   HOST=uj5wyc0l7x-3.algolianet.com
   _PATH="/1/indexes/Item_production_sort_date/query?x-algolia-agent=Algolia%20for%20JavaScript%20(4.0.2)%3B%20Browser%20(lite)&x-algolia-api-key=8ece23f8eb07cd25d40262a1764599b1&x-algolia-application-id=UJ5WYC0L7X"
   # HTTP client (curl)
   #curl -A "" -d "$DATA" "https://$HOST$_PATH"
   # TCP client
   #echo "
   #foreground=no
   #[x]
   #accept=127.0.0.8:80
   #client=yes
   #connect=167.114.119.142:443
   #options=NO_TICKET
   #options=NO_RENEGOTIATION
   #renegotiation=no
   #sni=
   #sslVersion=TLSv1.3
   #" |stunnel -fd 0;
   #tr @ '\r' <<eof|openssl s_client -connect $HOST:443 -ign_eof
   #tr @ '\r' <<eof|bssl s_client -connect $HOST:443 
   #tr @ '\r' <<eof|nc -vvn 127.8 80
   tr @ '\r' <<eof|socat stdio,ignoreeof ssl:$HOST:443,verify=0
   POST $_PATH HTTP/1.1@
   host: $HOST@
   content-length: ${#DATA}@
   content-type: application/x-www-form-urlencoded@
   connection: close@
   @
   $DATA
   eof
   #x=$(ps ax|sed -n "/stunnel.-fd.0/{s/ *//;s/ .*//p;q}")
   #test ! $x||kill $x
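One caveat with the script above: the query is pasted straight into the JSON body, so a search term containing a double quote or backslash produces invalid JSON. Also, `${#DATA}` counts characters in some shells, while content-length is a byte count. The following is a minimal sketch of one way to handle both, assuming sed and wc are available. `json_escape` and `body_length` are hypothetical helper names, not part of the original script, and the shortened JSON body is only for illustration.

```shell
#!/bin/sh
# json_escape: escape backslashes and double quotes so the argument can
# sit inside a JSON string literal. Control characters are not handled.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# body_length: byte count of the request body, for the content-length header.
body_length() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Shortened body for illustration; the real one carries more fields.
QUERY=$(json_escape 'say "hello"')
DATA='{"query":"'"$QUERY"'","page":0}'
echo "$DATA"
echo "content-length: $(body_length "$DATA")"
```

For ASCII-only queries the byte count and `${#DATA}` agree, so this only matters once non-ASCII search terms are involved.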

1vuio0pswjnm7 · on April 15, 2023

Anyone who monitors what is being sent from their own computers over their own networks sees the Bearer token.

Everyone, including any member of the public, who visits twitter.com gets the same Bearer token.

No need to have an "account" with Twitter or to be "logged in".

One can simulate this with cURL.

   js=$(curl -sA "" https://twitter.com|grep -m1 -o "https://abs.twimg.com/responsive-web/client-web-legacy/main[^\"]*");
   curl -sA "" "$js"|tr , '\n'|grep -o \"AAAA.*\"
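The extraction step can be tried offline. Below is a sketch that applies the same tr and grep approach to a made-up JavaScript fragment. `extract_bearer` is a hypothetical name, the sample string and the token in it are placeholders rather than real values, and the pattern is slightly tighter than the one-liner above (`[^"]*` instead of `.*`) so that it stops at the first closing quote.

```shell
#!/bin/sh
# extract_bearer: read JavaScript on stdin, split it on commas, and print
# each double-quoted string that starts with AAAA.
extract_bearer() {
    tr , '\n' | grep -o '"AAAA[^"]*"'
}

# Made-up fragment standing in for the real bundle; the token is a placeholder.
sample='e.a={x:1,token:"AAAAexampleTOKEN%3Dabc",y:"other"}'
printf '%s\n' "$sample" | extract_bearer
```

Against the live site, the fragment would be replaced by the output of the second curl command.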

The same Bearer token value is used by people around the web for retrieving public tweets. It's public information. For example,

https://stackoverflow.com/questions/61140863/python-download...

https://github.com/twintproject/twint/raw/master/twint/run.p...

https://pypi.org/project/ScrapeTweets/

https://stackoverflow.com/questions/67137294/twitter-scrapin...

https://github.com/m4fn3/pytweetdeck/blob/master/pytweetdeck...

https://github.com/jonbakerfish/TweetScraper/issues/127

https://github.com/JustAnotherArchivist/snscrape/issues/536

https://gist.github.com/codemasher/67ba24cee88029a3278c87ff9...

https://github.com/HoloArchivists/twspace-dl/issues/26

https://gist.github.com/AzureFlow/01cff883b9f1b22e8d0c094df9...

https://greasyfork.org/hu/scripts/454409-video-downloader-fo...

https://gist.github.com/moxak/ed83dd4169112a0b1669500fe85510...

https://gist.github.com/ceres-c/7c16a40c10cb476cce2c4b902334...

https://gist.github.com/theowenyoung/d4a62746025f7af8cdd8bfb...

userbinator · on April 13, 2023

I believe YouTube does the same thing.

If the backend is going to perform operations in the context of an identity, it makes sense to consistently give one to all users, including anonymous ones.

jaggederest · on April 13, 2023

I do this a lot. Good ol' 0xDEADBEEF makes it easier to track whether the header is actually missing (e.g., misconfigured) or just undefined but coming through correctly.
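A rough sketch of that idea in shell, under my own assumptions: `AUTH_TOKEN`, `outgoing_token`, and `classify_token` are hypothetical names, and the three labels are just one way a receiver might bucket the values.

```shell
#!/bin/sh
SENTINEL=0xDEADBEEF

# outgoing_token: the value a client would put in the header. Falls back
# to the sentinel when AUTH_TOKEN (hypothetical) is unset or empty.
outgoing_token() {
    printf '%s' "${AUTH_TOKEN:-$SENTINEL}"
}

# classify_token: how a receiver might interpret the header value.
classify_token() {
    case "$1" in
        "")          echo missing ;;     # nothing arrived: likely misconfiguration
        "$SENTINEL") echo anonymous ;;   # header arrived, but no real identity was set
        *)           echo identified ;;  # a real value came through
    esac
}

classify_token "$(outgoing_token)"
```

Because the sentinel is always sent, an empty value on the receiving side points at plumbing rather than at an anonymous user.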