Solution of Cantina of Babel

Problem Description

link

Problem Analysis

We can divide all people into different sets based on the languages they understand and speak. Everyone in a set can communicate with others in the same set—that is, they can understand the languages spoken by each other. Finally, we subtract the number of people in the largest set from the total number of people, which gives us the minimum number of people that need to be excluded.

Step 1: Convert Strings to IDs

All the information provided in the input appears in string form, which is extremely inconvenient for graph construction and traversal. Therefore, we first record strings and assign corresponding IDs using a dictionary and an array.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
fn build_language_dict(people: &Vec<Person>) -> HashMap<&String, usize> {
let mut lang_dict = HashMap::new();
let mut num_of_lang = 0usize;
for person in people {
for lang in person.understand.borrow() {
if lang_dict.contains_key(lang) {
continue;
}
lang_dict.insert(lang, num_of_lang);
num_of_lang += 1;
}
}
lang_dict
}
fn record_people(n: usize) -> Vec<Person> {
let mut people: Vec<Person> = vec![];
for _ in 0..n {
let input = read_line_of_string();
people.push(Person {
speak: input[1].to_owned(),
understand: HashSet::new(),
});
let id = people.len() - 1;
for i in 1..input.len() {
people[id].understand.insert(input[i].clone());
}
}
people
}

The record_people function records each line of input, then converts them into Person structs and stores them in the people array, allowing direct access to each person via their ID later. The build_language_dict function creates a HashMap with String keys and usize values, enabling us to find the corresponding value by language name. After converting all strings to IDs, we can begin constructing the graph.

Step 2: Construct the Language Conversion Graph

For each person, they can convert the languages they understand to the language they speak. This is equivalent to drawing a directed edge from the understood language to the spoken language. By traversing each person in this way, we can build a graph of language conversion.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
fn build_lang_graph(people: &Vec<Person>, lang_dict: &HashMap<&String, usize>) -> Vec<Vec<usize>> {
let mut lang_graph: Vec<Vec<usize>> = vec![vec![]; lang_dict.len()];
for person in people {
for lang in person.understand.borrow() {
if *lang == person.speak {
continue;
}
lang_graph[lang_dict[lang]].push(lang_dict[&person.speak]);
}
}
for i in 0..lang_graph.len() {
let tmp: HashSet<_> = lang_graph[i].drain(..).collect();
lang_graph[i].extend(tmp.into_iter());
}
lang_graph
}

The function’s end uses a HashSet for edge deduplication. Values in the array are placed into a HashSet and then back into the array to achieve deduplication.

Step 3: Verify Two People Can Communicate

After the language graph is established, a function is needed to verify whether two people can understand each other’s spoken languages. If one person’s spoken language can be converted into one of the languages the other person understands, then it can be said that the second person can understand the first person’s speech, and vice versa. This is implemented using a naive DFS algorithm.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
fn test_connected_dfs(
cur: usize,
target: &HashSet<usize>,
lang_graph: &Vec<Vec<usize>>,
visited: &mut Vec<bool>,
) -> bool {
if target.contains(&cur) {
return true;
}
visited[cur] = true;
for next in &lang_graph[cur] {
if visited[*next] {
continue;
}
if test_connected_dfs(*next, target, lang_graph, visited) {
return true;
}
}
false
}
fn test_biconnected(
a: &Person,
b: &Person,
lang_graph: &Vec<Vec<usize>>,
lang_dict: &HashMap<&String, usize>,
) -> bool {
let mut visited = vec![false; lang_graph.len()];
let target: HashSet<usize> = b
.understand
.borrow()
.into_iter()
.map(|s| lang_dict[s])
.collect();
if !test_connected_dfs(lang_dict[&a.speak], &target, lang_graph, &mut visited) {


return false;
}
visited = vec![false; lang_graph.len()];
let target: HashSet<usize> = a
.understand
.borrow()
.into_iter()
.map(|s| lang_dict[s])
.collect();
if !test_connected_dfs(lang_dict[&b.speak], &target, lang_graph, &mut visited) {
return false;
}
true
}

The test_biconnected function encapsulates test_connected_dfs. By calling this subfunction twice, it verifies whether two people can communicate.

Step 4: Construct a Disjoint Set Union

When analyzing the relationships between each person, a disjoint set union (DSU) data structure is used for organization. Everyone in a DSU can communicate with each other, and if two people from different DSUs can communicate, those DSUs are merged.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
struct DisjointSetUnion {
fa: Vec<usize>,
}
impl DisjointSetUnion {
fn get_fa(&mut self, cur: usize) -> usize {
if self.fa[cur] == cur {
return cur;
}
let tmp = self.fa[cur];
self.fa[cur] = self.get_fa(tmp);
self.fa[cur]
}
}
fn build_people_disjoint_set_union(
people: &Vec<Person>,
lang_graph: &Vec<Vec<usize>>,
lang_dict: &HashMap<&String, usize>,
) -> DisjointSetUnion {
let mut people_disjoint_set_union = DisjointSetUnion {
fa: (0..people.len()).into_iter().collect(),
};
for i in 0..people.len() {
for j in i + 1..people.len() {
if people_disjoint_set_union.get_fa(i) == people_disjoint_set_union.get_fa(j)
|| !test_biconnected(&people[i], &people[j], lang_graph, lang_dict)
{
continue;
}
let tmp = people_disjoint_set_union.get_fa(i);
people_disjoint_set_union.fa[tmp] = people_disjoint_set_union.get_fa(j);
}
}
people_disjoint_set_union
}

Path compression is used in the DSU implementation. Each call to get_fa updates the node’s parent to the root node of the set, thereby flattening the DSU and optimizing query efficiency.

Step 5: Obtain the Largest DSU Count

An array records the count of each DSU, and each node’s DSU is identified using get_fa, updating the count accordingly. The function returns the count of the largest DSU.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
impl DisjointSetUnion {
fn get_fa(&mut self, cur: usize) -> usize {
if self.fa[cur] == cur {
return cur;
}
let tmp = self.fa[cur];
self.fa[cur] = self.get_fa(tmp);
self.fa[cur]
}
fn get_largest_set(&mut self) -> usize {
let mut num_of_set = vec![0usize; self.fa.len()];
for i in 0..self.fa.len() {
num_of_set[self.get_fa(i)] += 1;
}
*num_of_set.iter().max().unwrap()
}
}

Step 6: Integrate and Return the Answer

Arrange all subfunctions in the main function and print the value of n - people_disjoint_set_union.get_largest_set() as the result.

1
2
3
4
5
6
7
8
9
fn main() {
let n = read_line_of_i32()[0] as usize;
let people = record_people(n);
let lang_dict = build_language_dict(&people);
let lang_graph = build_lang_graph(&people, &lang_dict);
let mut people_disjoint_set_union =
build_people_disjoint_set_union(&people, &lang_graph, &lang_dict);
println!("{}", n - people_disjoint_set_union.get_largest_set());
}

Conclusion: Complete Code

The provided code demonstrates a method for solving the problem by dividing people into sets based on language understanding and speaking, using data structures and algorithms such as hash maps, graphs, DFS, and disjoint set unions to efficiently find the minimum number of people that need to be excluded to allow for universal communication within the remaining group.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
use std::{
borrow::Borrow,
collections::{HashMap, HashSet},
};
struct Person {
speak: String,
understand: HashSet<String>,
}
struct DisjointSetUnion {
fa: Vec<usize>,
}
impl DisjointSetUnion {
fn get_fa(&mut self, cur: usize) -> usize {
if self.fa[cur] == cur {
return cur;
}
let tmp = self.fa[cur];
self.fa[cur] = self.get_fa(tmp);
self.fa[cur]
}
fn get_largest_set(&mut self) -> usize {
let mut num_of_set = vec![0usize; self.fa.len()];
for i in 0..self.fa.len() {
num_of_set[self.get_fa(i)] += 1;
}
*num_of_set.iter().max().unwrap()
}
}
fn main() {
let n = read_line_of_i32()[0] as usize;
let people = record_people(n);
let lang_dict = build_language_dict(&people);
let lang_graph = build_lang_graph(&people, &lang_dict);
let mut people_disjoint_set_union =
build_people_disjoint_set_union(&people, &lang_graph, &lang_dict);
println!("{}", n - people_disjoint_set_union.get_largest_set());
}
fn build_people_disjoint_set_union(
people: &Vec<Person>,
lang_graph: &Vec<Vec<usize>>,
lang_dict: &HashMap<&String, usize>,
) -> DisjointSetUnion {
let mut people_disjoint_set_union = DisjointSetUnion {
fa: (0..people.len()).into_iter().collect(),
};
for i in 0..people.len() {
for j in i + 1..people.len() {
if people_disjoint_set_union.get_fa(i) == people_disjoint_set_union.get_fa(j)
|| !test_biconnected(&people[i], &people[j], lang_graph, lang_dict)
{
continue;
}
let tmp = people_disjoint_set_union.get_fa(i);
people_disjoint_set_union.fa[tmp] = people_disjoint_set_union.get_fa(j);
}
}
people_disjoint_set_union
}
fn test_connected_dfs(
cur: usize,
target: &HashSet<usize>,
lang_graph: &Vec<Vec<usize>>,
visited: &mut Vec<bool>,
) -> bool {
if target.contains(&cur) {
return true;
}
visited[cur] = true;
for next in &lang_graph[cur] {
if visited[*next] {
continue;
}
if test_connected_dfs(*next, target, lang_graph, visited) {
return true;
}
}
false
}
fn test_biconnected(
a: &Person,
b: &Person,
lang_graph: &Vec<Vec<usize>>,
lang_dict: &HashMap<&String, usize>,
) -> bool {
let mut visited = vec![false; lang_graph.len()];
let target: HashSet<usize> = b
.understand
.borrow()
.into_iter()
.map(|s| lang_dict[s])
.collect();
if !test_connected_dfs(lang_dict[&a.speak], &target, lang_graph, &mut visited) {
return false;
}
visited = vec![false; lang_graph.len()];
let target: HashSet<usize> = a
.understand
.borrow()
.into_iter()
.map(|s| lang_dict[s])
.collect();
if !test_connected_dfs(lang_dict[&b.speak], &target, lang_graph, &mut visited) {
return false;
}
true
}
fn build_lang_graph(people: &Vec<Person>, lang_dict: &HashMap<&String, usize>) -> Vec<Vec<usize>> {
let mut lang_graph: Vec<Vec<usize>> = vec![vec![]; lang_dict.len()];
for person in people {
for lang in person.understand.borrow() {
if *lang == person.speak {
continue;
}
lang_graph[lang_dict[lang]].push(lang_dict[&person.speak]);
}
}
for i in 0..lang_graph.len() {
let tmp: HashSet<_> = lang_graph[i].drain(..).collect();
lang_graph[i].extend(tmp.into_iter());
}
lang_graph
}
fn build_language_dict(people: &Vec<Person>) -> HashMap<&String, usize> {
let mut lang_dict = HashMap::new();
let mut num_of_lang = 0usize;
for person in people {
for lang in person.understand.borrow() {
if lang_dict.contains_key(lang) {
continue;
}
lang_dict.insert(lang, num_of_lang);
num_of_lang += 1;
}
}
lang_dict
}
fn record_people(n: usize) -> Vec<Person> {
let mut people: Vec<Person> = vec![];
for _ in 0..n {
let input = read_line_of_string();
// println!("{:?}", input);
people.push(Person {
speak: input[1].to_owned(),
understand: HashSet::new(),
});
let id = people.len() - 1;
for i in 1..input.len() {
people[id].understand.insert(input[i].clone());
}
}
people
}
fn read_line_of_i32() -> Vec<i32> {
let mut buffer = String::new();
let _ = std::io::stdin().read_line(&mut buffer);
buffer
.trim()
.to_owned()
.split(" ")
.map(|x| x.to_owned().parse::<i32>().unwrap())
.collect()
}
fn read_line_of_string() -> Vec<String> {
let mut buffer = String::new();
let _ = std::io::stdin().read_line(&mut buffer);
buffer
.trim()
.to_owned()
.split(" ")
.map(|x| x.to_owned())
.collect()
}