{"id":39794,"date":"2018-07-23T09:26:53","date_gmt":"2018-07-23T09:26:53","guid":{"rendered":"https:\/\/www.flubu.com\/blog\/?p=39794"},"modified":"2018-07-23T09:53:58","modified_gmt":"2018-07-23T09:53:58","slug":"optimized-solution-for-periodic-words","status":"publish","type":"post","link":"https:\/\/www.flubu.com\/blog\/2018\/07\/23\/optimized-solution-for-periodic-words\/","title":{"rendered":"Optimized solution for periodic words"},"content":{"rendered":"<p>The <a href=\"https:\/\/www.flubu.com\/blog\/2018\/07\/19\/words-written-in-periodic-table-elements\/\">previous solution I coded<\/a> for the problem of finding words written in periodic element symbols didn&#8217;t sit well with me. It worked, but it was sub-optimal and was too rigid in how it tried to find matches.  A more complete and elegant solution is below and will discover *all* possible variants of elements for a given word. At each position in the word, it uses both 1- and 2-letter elements and branches out. The initial recursion will generate many more incomplete\/partial matches, but then the finalizing pass will recompose and flatten all paths through the graph and then cleanup after itself. This makes me happy.<\/p>\n<p>When running this code, the results are:<\/p>\n<p>Number of words found: 9092 \/\/ <b>a 22% increase<\/b><\/p>\n<p>Longest word: Internationalisation<br \/>\nI, N, Te, Rn, At, I, O, N, Al, I, S, At, I, O, N<br \/>\nIn, Te, Rn, At, I, O, N, Al, I, S, At, I, O, N<br \/>\nI, N, Te, Rn, At, I, O, Na, Li, S, At, I, O, N<\/p>\n<p>Longest word with unique variants: Bureaucratisation<br \/>\nB, U, Re, Au, Cr, At, I, S, At, I, O, N<br \/>\nB, U, Re, Au, C, Ra, Ti, S, At, I, O, N<\/p>\n<p>Longest word with most variants: Consciousnesses<br \/>\nC, O, N, S, C, I, O, U, S, Ne, S, Se, S<br \/>\nC, O, N, S, C, I, O, U, S, Ne, S, S, Es<br \/>\nC, O, N, S, C, I, O, U, S, N, Es, Se, S<br \/>\nC, O, N, S, C, I, O, U, S, N, Es, S, Es<br \/>\nC, O, N, Sc, I, O, U, S, Ne, S, Se, S<br \/>\nC, O, N, Sc, I, O, U, S, Ne, S, S, Es<br \/>\nC, O, N, Sc, I, O, U, S, N, Es, Se, S<br \/>\nC, O, N, Sc, I, O, U, S, N, Es, S, Es<br \/>\nC, O, N, S, C, I, O, U, SN, Es, Se, S<br \/>\nC, O, N, S, C, I, O, U, SN, Es, S, Es<br \/>\nCo, N, S, C, I, O, U, S, Ne, S, Se, S<br \/>\nCo, N, S, C, I, O, U, S, Ne, S, S, Es<br \/>\nCo, N, S, C, I, O, U, S, N, Es, Se, S<br \/>\nCo, N, S, C, I, O, U, S, N, Es, S, Es<\/p>\n<p>Longest word with most unique variants: Carbines<br \/>\nC, Ar, B, I, Ne, S<br \/>\nC, Ar, B, I, N, Es<br \/>\nCa, Rb, I, Ne, S<br \/>\nCa, Rb, I, N, Es<br \/>\nC, Ar, Bi, Ne, S<br \/>\nC, Ar, Bi, N, Es<br \/>\nC, Ar, B, In, Es<\/p>\n<p>Elements that were never used in any word:<br \/>\nBk, Cd, Cf, Cm, Fm, Gd, Hg, Md, Mg, Sg, Sr, Zn, Zr<\/p>\n<pre class=\"brush: java; collapse: true; light: false; title: Click to view the full source code; toolbar: true; notranslate\" title=\"Click to view the full source code\">\r\nimport java.io.*;\r\nimport java.util.*;\r\n\r\npublic class PeriodicWordFinder {\r\n\r\n    private TreeSet&lt;String&gt; words;\r\n    private TreeSet&lt;String&gt; elements;\r\n    private ArrayList&lt;PeriodicWord&gt; periodicWords;\r\n\r\n    public PeriodicWordFinder() {\r\n        \r\n        try {\r\n            words = read(&quot;wordlist.txt&quot;);\r\n            elements = new TreeSet&lt;&gt;((s1, s2) -&gt; {\r\n                int lenthComparison = Integer.compare(s1.length(), s2.length());\r\n                if (lenthComparison == 0)\r\n                    return s1.compareTo(s2);\r\n                else\r\n                    return lenthComparison;\r\n            });\r\n            elements.addAll(read(&quot;elements.txt&quot;));\r\n            periodicWords = new ArrayList&lt;&gt;();\r\n            System.out.println(&quot;words.size() = &quot; + words.size());\r\n            System.out.println(&quot;elements.size() = &quot; + elements.size());\r\n        } catch (Exception e) {\r\n            e.printStackTrace();\r\n        }\r\n    }\r\n\r\n    private TreeSet&lt;String&gt; read(String fileName) throws Exception {\r\n\r\n        TreeSet&lt;String&gt; retval = new TreeSet&lt;&gt;();\r\n        InputStream stream = getClass().getClassLoader().getResourceAsStream(fileName);\r\n        BufferedReader in = new BufferedReader(new InputStreamReader(stream));\r\n        String line;\r\n        while ((line = in.readLine()) != null) {\r\n            line = line.toUpperCase().trim();\r\n            \/\/no elements contain Q or J\r\n            if (line.contains(&quot;Q&quot;) || line.contains(&quot;J&quot;)) {\r\n                continue;\r\n            }\r\n            retval.add(line);\r\n        }\r\n        in.close();\r\n        return retval;\r\n\r\n    }\r\n\r\n    private void findWords() {\r\n\r\n        for (String word : words) {\r\n            PeriodicWord pWord = new PeriodicWord(word);\r\n            if (periodicWord(1, pWord)) {\r\n                periodicWords.add(pWord);\r\n            }\r\n        }\r\n\r\n        \/\/the new variant searcher algo will find partial\/incomplete matches\r\n        \/\/this needs to be filtered out by the finalizeVariants method\r\n        finalizeVariants();\r\n        for (PeriodicWord pWord : periodicWords) {\r\n            System.out.println(&quot;found: &quot; + pWord.getInitialWord());\r\n        }\r\n    }\r\n\r\n    private boolean periodicWord(long variantId, PeriodicWord word) {\r\n\r\n        if (word.getWord(variantId).length() == 0)\r\n            return true;\r\n\r\n        \/*COSEU\r\n        -- 11 C  -- 111  O  -- 1111 S  -- 11111 Eu\r\n                 -- 111  O  -- 1112 Se -- 11121 U\r\n                 -- 112  Os -- 1121 Eu\r\n        -- 12 Co -- 121  S  -- 1211 Eu\r\n                 -- 122  Se -- 1221 U\r\n         *\/\r\n\r\n        long ndx = 1;\r\n        boolean matchFound = false;\r\n        for (String element : elements) {\r\n            if (word.getWord(variantId).startsWith(element)) {\r\n                matchFound = true;\r\n                long newVariantId = variantId * 10 + ndx++;\r\n                String alreadyPresent = word.setElement(newVariantId, element);\r\n                \/\/if (alreadyPresent != null) {\r\n                \/\/    System.err.println(&quot;Should not already be element &quot; + alreadyPresent + \r\n                \/\/    &quot; at newVariantId &quot; + variantId + &quot; for word&quot; + word.getInitialWord());\r\n                \/\/}\r\n                word.setWord(newVariantId, word.getWord(variantId).substring(element.length()));\r\n                if (!periodicWord(newVariantId, word)) {\r\n                    \/\/we found a partial match\r\n                    \/\/it will be dealt with in finalize\r\n                }\r\n            }\r\n        }\r\n        return matchFound;\r\n\r\n    }\r\n\r\n    private void finalizeVariants() {\r\n\r\n        for (Iterator&lt;PeriodicWord&gt; it = periodicWords.iterator(); it.hasNext(); ) {\r\n\r\n            PeriodicWord pWord = it.next();\r\n\r\n            TreeMap&lt;Long, String&gt; elementVariants = pWord.getElementVariants();\r\n            boolean allFound = false;\r\n            while (!allFound &amp;&amp; !elementVariants.isEmpty()) {\r\n\r\n                \/\/start from the end of the collection and work your way backwards,\r\n                \/\/collapsing the paths in the graph\r\n                String lastElement = elementVariants.lastEntry().getValue().toUpperCase();\r\n                if (pWord.getInitialWord().toUpperCase().endsWith(lastElement)) {\r\n                    \/\/build a word\r\n                    List&lt;String&gt; elementVariant = buildVariant(elementVariants);\r\n                    \/\/add last sanity check, check that you don't return a partial word\r\n                    String elVarStr = String.join(&quot;&quot;, elementVariant).toUpperCase();\r\n                    if (pWord.getInitialWord().equals(elVarStr)) {\r\n                        pWord.addElementWordVariant(elementVariant);\r\n                    }\r\n                    \/\/remove tail of collection and continue\r\n                    elementVariants.remove(elementVariants.lastKey());\r\n\r\n                } else {\r\n                    \/\/all words found, stop loop\r\n                    allFound = true;\r\n                }\r\n            }\r\n\r\n            \/\/if only partial words were mapped - no full words - remove as invalid\r\n            if (pWord.elementWordVariants.isEmpty()) {\r\n                it.remove();\r\n            }\r\n        }\r\n\r\n    }\r\n\r\n    private List&lt;String&gt; buildVariant(TreeMap&lt;Long, String&gt; elementVariants) {\r\n\r\n        \/*COSEU\r\n                -- 11 C  -- 111  O  -- 1111 S  -- 11111 Eu\r\n                         -- 111  O  -- 1112 Se -- 11121 U\r\n                         -- 112  Os -- 1121 Eu\r\n                -- 12 Co -- 121  S  -- 1211 Eu\r\n                         -- 122  Se -- 1221 U\r\n         *\/\r\n        List&lt;String&gt; retval = new ArrayList&lt;&gt;();\r\n        boolean done = false;\r\n        long key = elementVariants.lastKey();\r\n        while (!done) {\r\n            String val = elementVariants.get(key);\r\n            if (val != null) {\r\n                retval.add(0, val);\r\n            } else {\r\n                done = true;\r\n            }\r\n            key = key \/ 10;\r\n        }\r\n        return retval;\r\n    }\r\n\r\n    public List&lt;PeriodicWord&gt; getPeriodicWords() {\r\n        return periodicWords;\r\n    }\r\n\r\n    public Set&lt;String&gt; getElements() {\r\n        return elements;\r\n    }\r\n\r\n    public static void main(String&#x5B;] args) {\r\n\r\n        PeriodicWordFinder ew = new PeriodicWordFinder();\r\n        ew.findWords();\r\n\r\n        \/\/now we have fun\r\n        List&lt;PeriodicWord&gt; pWords = ew.getPeriodicWords();\r\n        System.out.println(&quot;Number of words found: &quot; + pWords.size());\r\n\r\n        \/\/longest word\r\n        pWords.sort((o1, o2) -&gt; Integer.compare(o1.getInitialWord().length(), o2.getInitialWord().length()));\r\n        System.out.println(&quot;Longest word: &quot; + pWords.get(pWords.size() - 1));\r\n\r\n        \/\/longest word with only unique elements\r\n        pWords.sort((o1, o2) -&gt; {\r\n            String s1 = String.join(&quot;&quot;, o1.getVariantWithMostElements(true));\r\n            String s2 = String.join(&quot;&quot;, o2.getVariantWithMostElements(true));\r\n            int lenthComparison = Integer.compare(s1.length(), s2.length());\r\n            if (lenthComparison == 0)\r\n                return s1.compareTo(s2);\r\n            else\r\n                return lenthComparison;\r\n        });\r\n        System.out.println(&quot;Longest word with unique variants: &quot; + pWords.get(pWords.size() - 1));\r\n\r\n        \/\/longest word with with most variants\r\n        pWords.sort((o1, o2) -&gt; {\r\n            int numberComparison = Integer.compare(o1.getNumberOfVariants(false), o2.getNumberOfVariants(false));\r\n            if (numberComparison == 0) {\r\n                int lenthComparison = Integer.compare(o1.getInitialWord().length(), o2.getInitialWord().length());\r\n                if (lenthComparison == 0)\r\n                    return o1.getInitialWord().compareTo(o2.getInitialWord());\r\n                else\r\n                    return lenthComparison;\r\n            } else {\r\n                return numberComparison;\r\n            }\r\n        });\r\n        System.out.println(&quot;Longest word with most variants: &quot; + pWords.get(pWords.size() - 1));\r\n\r\n        \/\/longest word with most unique variants\r\n        pWords.sort((o1, o2) -&gt; {\r\n            int numberComparison = Integer.compare(o1.getNumberOfVariants(true), o2.getNumberOfVariants(true));\r\n            if (numberComparison == 0) {\r\n                int lenthComparison = Integer.compare(o1.getInitialWord().length(), o2.getInitialWord().length());\r\n                if (lenthComparison == 0)\r\n                    return o1.getInitialWord().compareTo(o2.getInitialWord());\r\n                else\r\n                    return lenthComparison;\r\n            } else {\r\n                return numberComparison;\r\n            }\r\n        });\r\n        System.out.println(&quot;Longest word with most unique variants: &quot; + pWords.get(pWords.size() - 1));\r\n\r\n        \/\/is there any element that was not used?\r\n        TreeSet&lt;String&gt; elements = new TreeSet&lt;&gt;(ew.getElements());\r\n        for(PeriodicWord pWord : pWords){\r\n            for (List&lt;String&gt; variantElements : pWord.elementWordVariants ){\r\n                elements.removeAll(variantElements);\r\n                if (elements.isEmpty()){\r\n                    break;\r\n                }\r\n            }\r\n        }\r\n        System.out.println(&quot;Elements not used: &quot; + elements);\r\n    }\r\n\r\n    private class PeriodicWord {\r\n\r\n        private TreeMap&lt;Long, String&gt; wordVariants; \/\/will get mangled during recursion\r\n        private String initialWord; \/\/keep track of initial word\r\n        private TreeMap&lt;Long, String&gt; elementVariants;  \/\/will get updated during recursion\r\n        private List&lt;List&lt;String&gt;&gt; elementWordVariants;\r\n\r\n        public int getNumberOfVariants(boolean onlyUnique) {\r\n            List&lt;List&lt;String&gt;&gt; sorted = new ArrayList&lt;&gt;();\r\n            sorted.addAll(elementWordVariants);\r\n            if (onlyUnique) {\r\n                sorted.removeIf(this::containsDuplicateElements);\r\n            }\r\n            return sorted.size();\r\n        }\r\n\r\n        public List&lt;String&gt; getVariantWithMostElements(boolean onlyUnique) {\r\n            if (elementWordVariants.size() == 1) {\r\n                return elementWordVariants.get(0);\r\n            } else {\r\n                List&lt;List&lt;String&gt;&gt; sorted = new ArrayList&lt;&gt;();\r\n                sorted.addAll(elementWordVariants);\r\n                if (onlyUnique) {\r\n                    sorted.removeIf(this::containsDuplicateElements);\r\n                }\r\n                sorted.sort((o1, o2) -&gt; Integer.compare(o1.size(), o2.size()));\r\n                if (sorted.size() &gt; 0) {\r\n                    return sorted.get(sorted.size() - 1);\r\n                } else {\r\n                    return new ArrayList&lt;&gt;();\r\n                }\r\n            }\r\n        }\r\n\r\n        private boolean containsDuplicateElements(List&lt;String&gt; elementVariant) {\r\n            return new HashSet&lt;&gt;(elementVariant).size() != elementVariant.size();\r\n        }\r\n\r\n        public PeriodicWord(String word) {\r\n            initialWord = word;\r\n            elementVariants = new TreeMap&lt;&gt;();\r\n            wordVariants = new TreeMap&lt;&gt;();\r\n            wordVariants.put(1l, word);\r\n            elementWordVariants = new ArrayList&lt;&gt;();\r\n        }\r\n\r\n        public void setWord(long variantId, String word) {\r\n            wordVariants.put(variantId, word);\r\n        }\r\n\r\n        public TreeMap&lt;Long, String&gt; getElementVariants() {\r\n            return elementVariants;\r\n        }\r\n\r\n        public String getWord(long variantId) {\r\n            return wordVariants.get(variantId);\r\n        }\r\n\r\n        public String setElement(long variantId, String element) {\r\n            return elementVariants.put(variantId, element);\r\n        }\r\n\r\n        public String getInitialWord() {\r\n            return initialWord;\r\n        }\r\n\r\n        public void addElementWordVariant(List&lt;String&gt; elementWordVariant) {\r\n            elementWordVariants.add(elementWordVariant);\r\n        }\r\n\r\n        @Override\r\n        public String toString() {\r\n            return &quot;PeriodicWord{&quot; +\r\n                    &quot;initialWord='&quot; + initialWord + '\\'' +\r\n                    &quot;, elementWordVariants=&quot; + elementWordVariants +\r\n                    '}';\r\n        }\r\n\r\n    }\r\n}\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>The previous solution I coded for the problem of finding words written in periodic element symbols didn&#8217;t sit well with me. It worked, but it was sub-optimal and was too rigid in how it tried to find matches. A more complete and elegant solution is below and will discover *all* possible variants of elements for&#8230;<\/p>\n<p class=\"more-link-wrap\"><a href=\"https:\/\/www.flubu.com\/blog\/2018\/07\/23\/optimized-solution-for-periodic-words\/\" class=\"more-link\">Read More<span class=\"screen-reader-text\"> &ldquo;Optimized solution for periodic words&rdquo;<\/span> &raquo;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[17],"class_list":["post-39794","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-geek"],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p3u9vK-alQ","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/posts\/39794","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/comments?post=39794"}],"version-history":[{"count":5,"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/posts\/39794\/revisions"}],"predecessor-version":[{"id":39799,"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/posts\/39794\/revisions\/39799"}],"wp:attachment":[{"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/media?parent=39794"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/categories?post=39794"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.flubu.com\/blog\/wp-json\/wp\/v2\/tags?post=39794"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}