Is it an Anagram?
English words become interesting when you rearrange their letters: the new order can produce a word with a completely different meaning. Wikipedia defines an anagram as follows:
An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once. The original word or phrase is known as the subject of the anagram. Any word or phrase that exactly reproduces the letters in another order is an anagram. Someone who creates anagrams may be called an "anagrammatist". (Wikipedia)
In Alteryx Community Challenge 59, the task is to compare word 1 and word 2. If every character appears the same number of times in word 1 as in word 2, the pair is an anagram.
My idea:
1/ Create a RecordID field for the dataset.
2/ Compare the length of word 1 and word 2. If the lengths differ, the pair is not an anagram.
3/ Split word 1 and word 2 into one character per row. Then count how many times each character appears in each word.
4/ Compare the count of each character in word 1 with its count in word 2. If the two counts are equal, return 1; otherwise, return 0.
5/ For each RecordID, compare the number of different characters in the word with the number of characters that returned 1 in step 4. If the two numbers match, the pair is an anagram.
6/ Union the data from step 2 (not an anagram) with the data from step 5. This is the final result.
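The idea above can be sketched for a single pair of words in Python. This is only an illustration of the logic, not the Alteryx workflow itself: the function name is_anagram is hypothetical, and the comparison is assumed to be case-sensitive.

```python
from collections import Counter


def is_anagram(word1: str, word2: str) -> bool:
    """Hypothetical helper mirroring steps 2 to 5 for one pair of words."""
    # Step 2: words of different length can never be anagrams.
    if len(word1) != len(word2):
        return False

    # Step 3: count how many times each character appears in each word.
    count_w1 = Counter(word1)
    count_w2 = Counter(word2)

    # Step 4: 1 when a character has the same count in both words, else 0.
    # A Counter returns 0 for a character it has never seen.
    equal_char = {
        ch: 1 if count_w1[ch] == count_w2[ch] else 0 for ch in count_w1
    }

    # Step 5: every distinct character must have returned 1.
    return len(count_w1) == sum(equal_char.values())


print(is_anagram("listen", "silent"))  # True
print(is_anagram("aabb", "abbb"))      # False
```

Because the lengths are already known to be equal, checking only the characters of word 1 is enough: word 2 cannot contain an extra character without also being short on another one.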
Step 1: Use the Record ID tool to add a RecordID to each row of the dataset, incrementing by 1.
Step 2: Compare the length of word 1 and the length of word 2 (Image 1)
Use the Filter tool with the condition Length([word1])==Length([word2]). Rows that meet the condition go to the True output; all other rows go to the False output and are not anagrams.
Step 3: Split the words into characters, one per row.
Use the RegEx tool to split word 1 and word 2 into characters. The regular expression is "." (without the quotation marks) and the Output Method is Tokenize. (Image 2 shows the result of splitting word 1 into characters.)
Then count how many times each character appears in word 1 by using the Summarize tool, grouped by RecordID and character. (Image 3)
Do the same for word 2, then join the two results together (Image 3b).
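As a rough Python analogue of Step 3, the sketch below tokenizes each word with the same "." pattern, counts the characters, and joins the two sets of counts. The function names char_counts and join_counts are hypothetical, and the join is assumed to be an inner join on RecordID and character, so a character that appears in only one of the two words drops out of the result.

```python
import re
from collections import Counter


def char_counts(record_id, word):
    """Tokenize a word with "." and count each character per RecordID."""
    chars = re.findall(".", word)  # one token per character
    return {(record_id, ch): n for ch, n in Counter(chars).items()}


def join_counts(record_id, word1, word2):
    """Inner-join the counts of both words on (RecordID, character)."""
    c1 = char_counts(record_id, word1)
    c2 = char_counts(record_id, word2)
    # Each row: (RecordID, character, Count_w1, Count_w2)
    return [(rid, ch, c1[(rid, ch)], c2[(rid, ch)])
            for (rid, ch) in c1 if (rid, ch) in c2]


print(join_counts(1, "aab", "abb"))
# [(1, 'a', 2, 1), (1, 'b', 1, 2)]
```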
Step 4: Compare how many times each character appears in word 1 with how many times it appears in word 2.
Use the Formula tool to create the "equal_char" field. If a character's count in word 1 equals its count in word 2, return 1; otherwise, return 0. (Image 4)
IF [Count_w1]=[Count_w2] THEN 1
ELSE 0
ENDIF
Example: for RecordID = 8, the words contain 5 different characters in total, but only 3 of them have the same count in Count_w1 and Count_w2. This means RecordID = 8 is not an anagram.
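To make the equal_char logic concrete, here is a small Python sketch. The two words are made up for illustration (they are not the actual RecordID = 8 data from the challenge), but they reproduce the same situation: 5 different characters, of which only 3 have matching counts. The function name equal_char_rows is hypothetical.

```python
from collections import Counter


def equal_char_rows(word1, word2):
    """Return (character, Count_w1, Count_w2, equal_char) per shared character."""
    count_w1, count_w2 = Counter(word1), Counter(word2)
    return [
        (ch, count_w1[ch], count_w2[ch],
         1 if count_w1[ch] == count_w2[ch] else 0)
        for ch in count_w1 if ch in count_w2
    ]


# Hypothetical pair with equal length but different character counts.
rows = equal_char_rows("aabcde", "abbcde")
for row in rows:
    print(row)
# ('a', 2, 1, 0)
# ('b', 1, 2, 0)
# ('c', 1, 1, 1)
# ('d', 1, 1, 1)
# ('e', 1, 1, 1)

print(len(rows), sum(r[3] for r in rows))  # 5 3
```

Since 5 is not equal to 3, this hypothetical pair would not be an anagram.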
Step 5: Compare the number of different characters in the word with the sum of the "equal_char" field.
Use the Summarize tool to count how many different characters there are, grouped by RecordID. (*)
Use the Summarize tool to sum the "equal_char" field, grouped by RecordID. (**)
Use the Join Multiple tool to join (*), (**), and the word 1 and word 2 fields together. (Image 5)
Then use the Formula tool to create an Anagram field. If (*) = (**), return "Yes"; otherwise, return "No".
If [Count_char]=[Sum_equal_char]
THEN "Yes"
ELSE "No"
ENDIF
I got the result in Image 5b. For RecordID = 8, both words have the same length, but the counts of individual characters differ between the two words. Therefore, the Anagram result is "No".
Step 6: Use the Union tool to combine this result with the rows from Step 2 in which word 1 and word 2 have different lengths (not anagrams). (Image 6)
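Putting the steps together, the whole flow over a dataset might look like the Python sketch below. It is an approximation under stated assumptions, not a translation of the workflow: run_workflow and the sample pairs are hypothetical, RecordID is assumed to start at 1, and the per-character comparison of Steps 3 to 5 is replaced by a direct comparison of two Counter objects, which gives the same answer for words of equal length.

```python
from collections import Counter


def run_workflow(pairs):
    """Return (RecordID, word1, word2, Anagram) rows for a list of word pairs."""
    # Step 1: add a RecordID that increments by 1 (assumed to start at 1).
    records = [(i, w1, w2) for i, (w1, w2) in enumerate(pairs, start=1)]

    # Step 2: split the records on whether the two lengths match.
    same_len = [r for r in records if len(r[1]) == len(r[2])]
    diff_len = [r for r in records if len(r[1]) != len(r[2])]

    # Steps 3 to 5 (compact stand-in): compare the character counts directly.
    checked = [(rid, w1, w2, "Yes" if Counter(w1) == Counter(w2) else "No")
               for rid, w1, w2 in same_len]

    # Pairs with different lengths are never anagrams.
    rejected = [(rid, w1, w2, "No") for rid, w1, w2 in diff_len]

    # Step 6: union both branches and restore the RecordID order.
    return sorted(checked + rejected)


for row in run_workflow([("listen", "silent"), ("abc", "abcd"), ("aabb", "abbb")]):
    print(row)
# (1, 'listen', 'silent', 'Yes')
# (2, 'abc', 'abcd', 'No')
# (3, 'aabb', 'abbb', 'No')
```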
There are several ways to solve the Anagram challenge in Alteryx Community. What do you think of my solution? How did you solve the Anagram challenge? Did you find anything interesting while solving it? Please let me know in the comments.
My workflow:
Happy Learning!