Day 16: Reindeer Maze

Megathread guidelines

  • Keep top-level comments to solutions only; if you want to say something other than a solution, put it in a new post. (Replies to comments can be whatever.)
  • You can share code in code blocks by using three backticks, then the code, then three backticks, or use something such as https://topaz.github.io/paste/ if you prefer sending it through a URL

FAQ

  • Pyro@programming.dev

    Those are some really great optimizations, thank you! I understand what you’re doing generally, but I’ll have to step through the code myself to get completely familiar with it.

    It’s interesting that string operations win out here over graph algorithms even though this is technically a graph problem. Honestly, your write-up and optimizations deserve their own post.

    • Acters@lemmy.world

      I tried to compartmentalize it. The search is in its own function, and while that fill_in_dead_ends function is extremely large, a lot of it is replicated code; the match k case statement could just be removed. A lot of the code is extremely verbose and has an air of being “unrolled”, because I was tweaking each part of the process individually to see what could be done. The arms of that large match case all ended up being very similar code. I could condense it down a lot; however, I know doing so would hurt performance unless plenty of time were spent tweaking it. So unrolled copy-pasta was good enough.

      The real shining star is the find_next_dead_end function, because the regex I used before took 99% of the total ~300 ms. Even with this fast iterative function, find_next_dead_end still takes about 75% of the time needed to finish filling in dead ends. That’s because, as the search ran deeper into the string, it kept slowing down: it was roughly O(n*m) time complexity, where n is the line width and m is the number of lines searched until the next match. My approach was to store the position each search left off at, which conveniently was curr_row, curr_col. To keep the cost down on the logic that makes sure newly created dead ends also get filled in, I simply check whether the current search started from the beginning after it finished checking the final line.

      Looking at the line-by-line profiler from iPython, the function spends most of its time on while('.' in r[:first_loc]): and first_loc = r[:first_loc].rindex('.'), which is funny because those lines are still fast: over 11k+ hits on the same line at only a 5-5.5 microsecond cost each time they ran.
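      Roughly, the resume-from-last-position idea looks like this (a simplified sketch rather than my actual code; it assumes the maze is a list of strings with a solid '#' border, and all names are illustrative):

      ```python
      def find_next_dead_end(grid, curr_row, curr_col):
          # Resume scanning from (curr_row, curr_col) instead of the top of
          # the grid, so repeated calls never re-read already-cleared rows.
          for r in range(curr_row, len(grid) - 1):
              row = grid[r]
              c = row.find('.', curr_col if r == curr_row else 1)
              while c != -1:
                  walls = ((grid[r - 1][c] == '#') + (grid[r + 1][c] == '#')
                           + (row[c - 1] == '#') + (row[c + 1] == '#'))
                  if walls >= 3:  # three surrounding walls = dead end
                      return r, c
                  c = row.find('.', c + 1)
          return None

      def fill_in_dead_ends(grid):
          r, c = 1, 1
          while (found := find_next_dead_end(grid, r, c)) is not None:
              fr, fc = found
              grid[fr] = grid[fr][:fc] + '#' + grid[fr][fc + 1:]
              # Plugging one dead end can create a new one right behind it,
              # so back up a row before resuming the scan.
              r, c = max(1, fr - 1), 1
          return grid
      ```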

      Though I could likely remove that strange logic by moving it into find_next_dead_end instead of having that strange if/elif/else statement in the fill_in_dead_ends logic.

      There is so much that could still be improved, but it was quick and dirty.

      Now that I think about it, there is a way to make the regex faster too: simply slice already-searched lines off the string, so the regex doesn’t spend time looking at the same start of the string over and over.
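      A toy sketch of that slicing trick (with a stand-in pattern, not the real dead-end regex):

      ```python
      import re

      pat = re.compile(r'#\.#')  # stand-in pattern, not the real dead-end regex

      def find_all_with_trim(s):
          # Slice off everything up to the previous match before each search,
          # so the regex never rescans the start of the string.
          positions, offset = [], 0
          while (m := pat.search(s)) is not None:
              positions.append(offset + m.start())
              offset += m.end()
              s = s[m.end():]
          return positions
      ```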

    • Acters@lemmy.world

      Ah yes, I was right: simply slicing away lines that were already checked does make the regex faster. While that code is smaller, it is still slower than the more verbose option, and that is only because of the iterative approach of checking each node in the while(True) loop instead of building two lists of lines and manipulating them with .index() and .rindex(). [ Paste ]

      However, if you notice, even the regex is 3-5 milliseconds slower than my iterative approach with index. While having the regex on one line is nice, I think it is harder to read and could prove slightly more cumbersome, since it could be useless in other types of challenges, whereas the iterative approach is easy to adapt to most circumstances, quirks and all.

      Also, it shows that the more verbose option is still faster by 7 ms, because checking each node in the while(True) loop is a rather slow approach. So really, there is not much to it overall, and the main slowdown is in your solver, which I didn’t touch at all, because I only wanted to show the dead-end filling part.

    • Acters@lemmy.world

      If you are wondering how my string operations manage to be fast, it is down to the simple fact that Python’s index and rindex run in practically O(n) time (and the way I use them after slicing the string, it is closer to O(log n)). Here are some more tricks in case you wish to think about that more: [link] Also, the more verbose option is simply a trick of batch processing: why bother checking each node individually when we already know that a dead end is simply straight lines? See the sketch below.
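      For a horizontal run, the batch fill looks roughly like this (a simplified sketch under the same list-of-strings assumption; a vertical run would need the transposed version):

      ```python
      def fill_dead_run(grid, r, c):
          # (r, c) is the tip of a horizontal dead end: walls above and
          # below, plus a wall on one side. Extend along the corridor while
          # the cells stay walled above and below, then plug the whole span
          # with one slice instead of re-discovering it cell by cell.
          row = grid[r]
          lo = hi = c
          while row[lo - 1] == '.' and grid[r - 1][lo - 1] == '#' and grid[r + 1][lo - 1] == '#':
              lo -= 1
          while row[hi + 1] == '.' and grid[r - 1][hi + 1] == '#' and grid[r + 1][hi + 1] == '#':
              hi += 1
          grid[r] = row[:lo] + '#' * (hi - lo + 1) + row[hi + 1:]
      ```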

      If an exceedingly large maze were just a simple two-spiral design, where one spiral is a dead end and the other holds the “end flag”, then my batch processing would simply outpace the slower per-node iterator. In that scenario there is a 50/50 chance you pick the right spiral, so it is easier to look for which one is a dead end and backtrack to choose the other option. Technically that is slower than guessing correctly on the first try, but it feels awfully similar to how bogosort works: you either randomly choose paths (removing previously checked ones) or deterministically enumerate all of them. Meanwhile, a dead end is extremely easy to find and culls all those paths as extremely low priority; in this spiral scenario that is the safer option compared to accidentally committing to the wrong path.

      What would be fastest would be to simply convert this to a bit-like representation: each wall becomes a 1 and each empty spot a 0. You would have to track the locations of the start and end separately. Something like the sketch below.
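      (A sketch with illustrative names; each row becomes an int with walls as 1-bits:)

      ```python
      def encode(grid):
          # Pack each row into an int: bit c of rows[r] is 1 iff grid[r][c] == '#'.
          # Start and end get tracked separately since they don't fit in one bit.
          rows, start, end = [], None, None
          for r, line in enumerate(grid):
              bits = 0
              for c, ch in enumerate(line):
                  if ch == '#':
                      bits |= 1 << c
                  elif ch == 'S':
                      start = (r, c)
                  elif ch == 'E':
                      end = (r, c)
              rows.append(bits)
          return rows, start, end

      def is_dead_end(rows, r, c):
          # A cell is a dead end if it is open and at least 3 of its 4
          # neighbours are walls; every check is now a shift and a mask.
          wall = lambda rr, cc: (rows[rr] >> cc) & 1
          return not wall(r, c) and (wall(r - 1, c) + wall(r + 1, c)
                                     + wall(r, c - 1) + wall(r, c + 1)) >= 3
      ```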