We can divide all people into different sets based on the languages they understand and speak. Everyone in a set can communicate with others in the same set—that is, they can understand the languages spoken by each other. Finally, we subtract the number of people in the largest set from the total number of people, which gives us the minimum number of people that need to be excluded.
Step 1: Convert Strings to IDs
All the information provided in the input appears in string form, which is extremely inconvenient for graph construction and traversal. Therefore, we first record strings and assign corresponding IDs using a dictionary and an array.
The record_people function records each line of input, then converts them into Person structs and stores them in the people array, allowing direct access to each person via their ID later. The build_language_dict function creates a HashMap with String keys and usize values, enabling us to find the corresponding value by language name. After converting all strings to IDs, we can begin constructing the graph.
Step 2: Construct the Language Conversion Graph
For each person, they can convert the languages they understand to the language they speak. This is equivalent to drawing a directed edge from the understood language to the spoken language. By traversing each person in this way, we can build a graph of language conversion.
The function’s end uses a HashSet for edge deduplication. Values in the array are placed into a HashSet and then back into the array to achieve deduplication.
Step 3: Verify Two People Can Communicate
After the language graph is established, a function is needed to verify whether two people can understand each other’s spoken languages. If one person’s spoken language can be converted into one of the languages the other person understands, then it can be said that the second person can understand the first person’s speech, and vice versa. This is implemented using a naive DFS algorithm.
The test_biconnected function encapsulates test_connected_dfs. By calling this subfunction twice, it verifies whether two people can communicate.
Step 4: Construct a Disjoint Set Union
When analyzing the relationships between each person, a disjoint set union (DSU) data structure is used for organization. Everyone in a DSU can communicate with each other, and if two people from different DSUs can communicate, those DSUs are merged.
Path compression is used in the DSU implementation. Each call to get_fa updates the node’s parent to the root node of the set, thereby flattening the DSU and optimizing query efficiency.
Step 5: Obtain the Largest DSU Count
An array records the count of each DSU, and each node’s DSU is identified using get_fa, updating the count accordingly. The function returns the count of the largest DSU.
The provided code demonstrates a method for solving the problem by dividing people into sets based on language understanding and speaking, using data structures and algorithms such as hash maps, graphs, DFS, and disjoint set unions to efficiently find the minimum number of people that need to be excluded to allow for universal communication within the remaining group.