Script to Get All Liked Items on Twitter (X)


Hello, incompetent me here. I scrape Twitter with an unofficial API library called Twikit. For a while now its sample code has been recommending asynchronous calls, and after upgrading I found everything had to be done asynchronously, so I rewrote the whole script accordingly.
haturatu/xd_likes
The flow is: first, every fetched tweet is stored in all_tweets.

Once the specified maximum number has been retrieved,

the contents of every tweet in all_tweets are extracted.

All of this is written to fav.log, so you can pull out whatever information you need later with grep.
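That accumulate-then-write pattern can be sketched roughly like this. The pagination stub (`fetch_page`) and the function names are invented stand-ins for the actual twikit calls, whose signatures vary by version:

```python
import asyncio

async def fetch_page(cursor):
    # Invented stub standing in for a twikit "likes" request:
    # returns (tweets, next_cursor), with cursor None when exhausted
    pages = {None: (["t1", "t2"], "c1"), "c1": (["t3"], None)}
    return pages[cursor]

async def collect_likes(max_tweets=100):
    all_tweets = []  # everything accumulates here first
    cursor = None
    while len(all_tweets) < max_tweets:
        tweets, cursor = await fetch_page(cursor)
        all_tweets.extend(tweets)
        if cursor is None:  # no more pages
            break
    # only after collection is everything written out
    with open("fav.log", "w") as f:
        for t in all_tweets:
            f.write(t + "\n")
    return all_tweets

result = asyncio.run(collect_likes())
print(result)  # ['t1', 't2', 't3']
```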

API Rate Limits

https://github.com/d60/twikit/blob/main/ratelimits.md

As you can see, API rate limits vary quite a bit.

Therefore, it was necessary to try several times with retry settings.

import asyncio
import logging

from httpx import ReadTimeout  # twikit makes its requests through httpx

logger = logging.getLogger(__name__)

# Retry settings
RETRY_LIMIT = 10
RETRY_DELAY = 300  # seconds

# Execute a request, retrying on failure
async def perform_request_with_retries(request_func, *args, **kwargs):
    for attempt in range(RETRY_LIMIT):
        try:
            response = await request_func(*args, **kwargs)  # await the asynchronous request
            if response:
                return response
        except ReadTimeout:
            logger.warning(f"Attempt {attempt + 1} failed due to ReadTimeout.")
        except Exception as e:
            logger.error(f"Attempt {attempt + 1} failed: {e}")
        await asyncio.sleep(RETRY_DELAY)
    raise Exception("Failed to fetch more tweets after retries.")

Since the rate limits seem to reset every 15 minutes, I figured that being able to retry for a total of 50 minutes (10 attempts × 300 seconds) leaves plenty of headroom. This part was a real bottleneck for me: trying to run right at the limit ate an enormous amount of time, so I raised the maximum drastically.
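The wrapper is generic: you pass it any coroutine plus its arguments. Here is a rough illustration with a stubbed request (`flaky_fetch` is an invented stand-in, not part of twikit) and the delay shortened so it finishes instantly:

```python
import asyncio

RETRY_LIMIT = 10
RETRY_DELAY = 0.01  # shortened from 300 s for this demo

async def perform_request_with_retries(request_func, *args, **kwargs):
    for attempt in range(RETRY_LIMIT):
        try:
            response = await request_func(*args, **kwargs)
            if response:
                return response
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
        await asyncio.sleep(RETRY_DELAY)
    raise Exception("Failed to fetch more tweets after retries.")

calls = 0

async def flaky_fetch(user_id):
    # Invented stand-in for a twikit call: fails twice, then succeeds
    global calls
    calls += 1
    if calls < 3:
        raise TimeoutError("rate limited")
    return [f"tweet from {user_id}"]

tweets = asyncio.run(perform_request_with_retries(flaky_fetch, "12345"))
print(tweets)  # the third attempt succeeds
```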

Image Saving

I've separated the script for downloading from the one for getting likes.

Image saving is done in xd.py.

Left at the default, images are not saved at their highest quality, so I append the name=orig parameter to each retrieved URL.

while user_tweets:
    for tweet in user_tweets:
        for media in tweet.media:
            url = media['media_url_https']
            # request the original-size image instead of the default
            if url.endswith('.jpg'):
                url = url.replace('.jpg', '?format=jpg&name=orig')
            clean_url = get_clean_url(url)
            save_path = os.path.join(save_folder, f"{screen_name}_{clean_url.split('/')[-1]}.jpeg")
            await download_image(url, save_path)
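The URL rewrite itself is a one-line string replacement. Isolated under a hypothetical helper name, it behaves like this:

```python
def to_orig_quality(url: str) -> str:
    # Hypothetical helper: swap the .jpg suffix for Twitter's
    # format/name query parameters to request the original size
    if url.endswith('.jpg'):
        return url.replace('.jpg', '?format=jpg&name=orig')
    return url

print(to_orig_quality("https://pbs.twimg.com/media/abc123.jpg"))
# https://pbs.twimg.com/media/abc123?format=jpg&name=orig
```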

PNG would give better quality, but presumably only images that were uploaded as PNG exist in that format in the first place. PNGs also eat a lot of storage, so I compromised on JPG.

A folder is created for each ID being fetched, images are saved there, and the ID is also prefixed to each image filename, a user-friendly design, if I do say so myself.

I couldn't find any other similar scripts at the moment, so I decided to publish this one. Or rather, people probably keep this kind of thing to themselves rather than publish it...

By the way, fav.log contains the URLs of the liked posts, so if you only want to save images from liked posts, you'd have to handle that separately, but it's doable. That's exactly why I put everything into fav.log.
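Filtering those URLs back out of the log is straightforward. The log layout below is an assumption for illustration (one free-form line per entry with the status URL embedded somewhere); the real fav.log format may differ:

```python
import re

# Assumed log format, for illustration only
log_lines = [
    "2024-01-01 liked https://x.com/someone/status/111 nice photo",
    "2024-01-02 note without a link",
    "2024-01-03 liked https://x.com/other/status/222",
]

# Match status URLs wherever they appear in a line
pattern = re.compile(r"https://x\.com/\w+/status/\d+")
urls = [m.group(0) for line in log_lines for m in [pattern.search(line)] if m]
print(urls)
```

This is the same thing a grep over fav.log would give you, just from inside Python.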

Actually, I thought it might be better to separate fav.log from the log when fetching all tweets, but I'm tired now, so I'll stop here.

That's all.

See you again.
