
Generating analysis of Twitter account using Chatflow Agent

Introduction

In Dify, you can use some crawler tools, such as Jina, which can convert web pages into markdown format that LLMs can read.

Recently, wordware.ai has brought to our attention that we can use crawlers to scrape social media for LLM analysis, creating more interesting applications.

However, X (formerly Twitter) announced on February 2, 2023 that it would stop providing free API access, and it has since upgraded its anti-crawling measures. As a result, tools like Jina are unable to access X's content directly.

Starting February 9, we will no longer support free access to the Twitter API, both v2 and v1.1. A paid basic tier will be available instead 🧵

— Developers (@XDevelopers) February 2, 2023

Fortunately, Dify also has an HTTP tool, which allows us to call external crawling tools by sending HTTP requests. Let's get started!

Prerequisites

Register Crawlbase

Crawlbase is an all-in-one data crawling and scraping platform designed for businesses and developers.

Crawlbase Scraper also lets you scrape data from social platforms such as X, Facebook, and Instagram.

Click to register: crawlbase.com

Deploy Dify locally

Dify is an open-source LLM app development platform. You can use the cloud service or deploy it locally with Docker Compose.

If you don’t want to deploy it locally, register a free Dify Cloud sandbox account here: https://cloud.dify.ai/signin.

Dify Cloud Sandbox users get 200 free credits, equivalent to 200 GPT-3.5 messages or 20 GPT-4 messages.

The following steps briefly show how to deploy Dify locally:

Clone Dify

git clone https://github.com/langgenius/dify.git

Start Dify

cd dify/docker
cp .env.example .env
docker compose up -d

Configure LLM Providers

Configure a model provider in the account settings:

Create a chatflow

Now, let's get started on the chatflow.

Click on Create from Blank to start:

The initialized chatflow should look like this:

Add nodes to chatflow

Start node

In the Start node, we can define variables that are collected at the beginning of a chat. In this article, we need a Twitter user’s ID as a string variable. Let’s name it id.

Click on Start node and add a new variable:

Code node

According to the Crawlbase docs, the variable url (which will be used in the following node) should be https://twitter.com/ followed by the user ID, such as https://twitter.com/elonmusk for Elon Musk.

To convert the user ID into a complete URL, we can use the following Python code to concatenate the prefix https://twitter.com/ with the user ID:

def main(id: str) -> dict:
    return {
        "url": "https://twitter.com/"+id,
    }
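
Users often type a handle with a leading @ or stray whitespace. A slightly more defensive variant of the function above could normalize the input first. This is only a sketch: the stripping rules are an assumption for illustration, not something the Crawlbase docs require.

```python
def main(id: str) -> dict:
    # Assumption: users may type "@elonmusk" or " elonmusk "; normalize both.
    handle = id.strip().lstrip("@")
    return {
        "url": "https://twitter.com/" + handle,
    }
```

With this variant, both elonmusk and @elonmusk produce the same url value.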

Add a Code node, select Python, and set the input and output variable names:

HTTP request node

Based on the Crawlbase docs, to scrape a Twitter user’s profile over HTTP, we need to fill in the HTTP Request node in the following format:

For security reasons, avoid entering the token value directly as plain text. In the latest version of Dify, we can store token values in Environment Variables. Click env - Add Variable to set the token value, so that plain text does not appear in the node.

Check https://crawlbase.com/dashboard/account/docs for your Crawlbase API key.

By typing /, you can easily insert the API key as a variable.
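
To make the request easier to picture outside Dify, here is a rough Python sketch of the URL such a node would call. It assumes the Crawlbase Crawling API endpoint https://api.crawlbase.com/ with token and url query parameters; treat both as assumptions and confirm them against the Crawlbase docs. The helper name build_request_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the Crawlbase Crawling API endpoint and its `token` / `url`
# query parameters; check the Crawlbase docs for the exact values.
CRAWLBASE_ENDPOINT = "https://api.crawlbase.com/"


def build_request_url(token: str, target_url: str) -> str:
    """Return the GET URL that the HTTP Request node would call."""
    # urlencode percent-encodes the target URL so it survives as one parameter.
    query = urlencode({"token": token, "url": target_url})
    return f"{CRAWLBASE_ENDPOINT}?{query}"


if __name__ == "__main__":
    # "YOUR_TOKEN" is a placeholder; never hard-code a real token.
    print(build_request_url("YOUR_TOKEN", "https://twitter.com/elonmusk"))
```

The sketch only builds the URL and sends nothing. In Dify, the token would come from the environment variable and the target URL from the Code node's url output.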

Tap the start button of this node to check whether it works correctly:

LLM node

Now, we can use an LLM to analyze the result scraped by Crawlbase and execute our command.

Set the context variable to the body output of the HTTP Request node.
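
Scraped pages can be long, and the body may not fit in the model's context window. One option, sketched below, is an extra Code node between the HTTP Request node and the LLM node that trims the text. The input name body, the output name context, and the 8000-character limit are all assumptions for illustration.

```python
def main(body: str) -> dict:
    # Hypothetical limit; tune it to the context window of the chosen model.
    max_chars = 8000
    # Collapse runs of whitespace so the limit is spent on actual content.
    text = " ".join(body.split())
    # Keep only the first max_chars characters of the scraped content.
    return {
        "context": text[:max_chars],
    }
```

Character-based trimming is crude but needs no tokenizer. A token-based limit would be more precise if one is available.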

The following is a sample system prompt.

Test run

Click Preview to start a test run, and enter a Twitter user ID in the id field.

For example, I want to analyze Elon Musk's tweets and write a tweet about global warming in his tone.

Does this sound like Elon? lol

Click Publish in the upper right corner and add it to your website.

Have fun!

Lastly…

Other X (Twitter) crawlers

In this article, I’ve introduced Crawlbase. As far as I know, it is the cheapest Twitter crawler service available, but it sometimes fails to scrape the content of user tweets correctly.

The Twitter crawler service used by wordware.ai, mentioned earlier, is Tweet Scraper V2, but a subscription to Apify, the platform that hosts it, costs $49 per month.
