Parsing Twitter Tweets in Java

This post was written by Brandon on December 10, 2010
Posted Under: Java,Twitter

Another method I needed for the Twitter client I am developing was one to parse tweets. By that, I mean making links, hashtags, and usernames clickable.

I did this using regular expressions.
To use Java's regex library, you will need this import:

 
import java.util.regex.*;
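For readers new to the library, here is a minimal sketch of how Pattern and Matcher work together, using the same hashtag shape as the full method. The class name HashtagFinder and the method findHashtags are made up for this illustration and are not part of the client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashtagFinder {
    // Whitespace or start of input, then one or more '#', then the tag (group 1).
    private static final Pattern HASHTAG =
            Pattern.compile("(?:\\s|\\A)#+([A-Za-z0-9_-]+)");

    // Hypothetical helper: collects every tag name, without the leading '#'.
    static List<String> findHashtags(String tweet) {
        List<String> tags = new ArrayList<String>();
        Matcher matcher = HASHTAG.matcher(tweet);
        // find() advances to the next match each time it returns true.
        while (matcher.find()) {
            tags.add(matcher.group(1));
        }
        return tags;
    }

    public static void main(String[] args) {
        // prints [java, regex]
        System.out.println(findHashtags("Learning #java and #regex today"));
    }
}
```

Pattern.compile builds the expression once, matcher() binds it to a string, and each call to find() moves to the next match, with group(1) holding whatever the first pair of capturing parentheses matched.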
 

Here is the full code. Pass in the tweet, and it returns the tweet with hashtags, usernames, and links wrapped in anchor tags.

 
  public String parseTweet(String inTweet){
    // Each pass captures the leading whitespace in group 1 and writes it back,
    // so only the token itself is wrapped, at the position where it was found.
    // (A global String.replace would also rewrite "#java" inside "#javascript".)

    //hash tags
    Pattern pattern = Pattern.compile("(\\s|\\A)(#+([A-Za-z0-9_-]+))");
    Matcher matcher = pattern.matcher(inTweet);
    StringBuffer result = new StringBuffer();
    while (matcher.find()){
      // A raw '#' in the query string would start a URL fragment, so send it as %23.
      String link = matcher.group(1) + "<a href='http://search.twitter.com/search?q=%23" + matcher.group(3) + "'>" + matcher.group(2) + "</a>";
      matcher.appendReplacement(result, Matcher.quoteReplacement(link));
    }
    matcher.appendTail(result);
    inTweet = result.toString();

    //Users
    pattern = Pattern.compile("(\\s|\\A)(@+([A-Za-z0-9_-]+))");
    matcher = pattern.matcher(inTweet);
    result = new StringBuffer();
    while (matcher.find()){
      // group(3) is the name without '@'; group(2) is the text as typed.
      String link = matcher.group(1) + "<a href='http://twitter.com/" + matcher.group(3) + "'>" + matcher.group(2) + "</a>";
      matcher.appendReplacement(result, Matcher.quoteReplacement(link));
    }
    matcher.appendTail(result);
    inTweet = result.toString();

    //links
    pattern = Pattern.compile("(^|[ \t\r\n])((ftp|http|https|mailto|aim|webcal|skype):(([A-Za-z0-9$_.+!*(),;/?:@&~=-])|%[A-Fa-f0-9]{2}){2,}(#([a-zA-Z0-9][a-zA-Z0-9$_.+!*(),;/?:@&~=%-]*))?([A-Za-z0-9$_+!*();/?:~-]))");
    matcher = pattern.matcher(inTweet);
    result = new StringBuffer();
    while (matcher.find()){
      // group(2) is the whole URL, without the whitespace in front of it.
      String url = matcher.group(2);
      String link = matcher.group(1) + "<a href='" + url + "' target='_blank'>" + url + "</a>";
      matcher.appendReplacement(result, Matcher.quoteReplacement(link));
    }
    matcher.appendTail(result);
    return result.toString();
  }
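
The hashtag and username passes share one shape: find a token that follows whitespace, then wrap it in an anchor. As a rough sketch of how that could be factored out, assuming a hypothetical TweetLinker class with made-up helpers link and linkTagsAndUsers (none of these names come from the client itself):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TweetLinker {
    // Group 1: leading whitespace, group 2: token as typed, group 3: bare name.
    static final Pattern HASHTAG = Pattern.compile("(\\s|\\A)(#+([A-Za-z0-9_-]+))");
    static final Pattern USER = Pattern.compile("(\\s|\\A)(@+([A-Za-z0-9_-]+))");

    // Hypothetical helper: wraps group 2 of every match in an anchor whose
    // href is hrefPrefix + group 3, and writes group 1 back unchanged.
    static String link(String text, Pattern pattern, String hrefPrefix) {
        Matcher matcher = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String anchor = matcher.group(1)
                    + "<a href='" + hrefPrefix + matcher.group(3) + "'>"
                    + matcher.group(2) + "</a>";
            // quoteReplacement keeps any '$' or '\' in the anchor literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(anchor));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    static String linkTagsAndUsers(String tweet) {
        // %23 is the URL-encoded '#', so the search query keeps the hashtag.
        String linked = link(tweet, HASHTAG, "http://search.twitter.com/search?q=%23");
        return link(linked, USER, "http://twitter.com/");
    }

    public static void main(String[] args) {
        System.out.println(linkTagsAndUsers("#java tips from @someuser"));
    }
}
```

For the sample tweet this prints <a href='http://search.twitter.com/search?q=%23java'>#java</a> tips from <a href='http://twitter.com/someuser'>@someuser</a>. Because appendReplacement rewrites each match where it was found, a tweet containing both #java and #javascript gets two separate links.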
 

And that's it! One method to handle everything.
