i18n_hardcoded_string_detector_and_locale_completeness_checker.py

# SKILL.md

---
name: i18n-localization
description: Internationalization and localization patterns. Detecting hardcoded strings, managing translations, locale files, RTL support.
allowed-tools: Read, Glob, Grep
---

# i18n & Localization

> Internationalization (i18n) and Localization (L10n) best practices.

---

## 1. Core Concepts

| Term | Meaning |
|------|---------|
| **i18n** | Internationalization - making the app translatable |
| **L10n** | Localization - the actual translations and formats for each locale |
| **Locale** | Language + Region (en-US, tr-TR) |
| **RTL** | Right-to-left languages (Arabic, Hebrew) |

---
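As a rough illustration of the **Locale** row in the Core Concepts table above, the sketch below splits a tag such as `en-US` into its language and region parts. `Locale` and `parse_locale` are hypothetical names introduced here, and the parsing is deliberately simplified: it does not implement full BCP 47 handling.

```python
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str           # lowercase language subtag, e.g. "en"
    region: Optional[str]   # uppercase region subtag, e.g. "US", or None


def parse_locale(tag: str) -> Locale:
    """Split a simple locale tag; accepts '-' or '_' as the separator."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    region = None
    for part in parts[1:]:
        # Region subtags are two letters or three digits; longer alphabetic
        # subtags such as "Hant" are scripts and are skipped here.
        if (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
            break
    return Locale(language, region)


print(parse_locale("en-US"))
print(parse_locale("tr_TR"))
```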

## 2. When to Use i18n

| Project Type | i18n Needed? |
|--------------|--------------|
| Public web app | ✅ Yes |
| SaaS product | ✅ Yes |
| Internal tool | ⚠️ Maybe |
| Single-region app | ⚠️ Consider future |
| Personal project | ❌ Optional |

---

## 3. Implementation Patterns

### React (react-i18next)

```tsx
import { useTranslation } from 'react-i18next';

function Welcome() {
  const { t } = useTranslation();
  return <h1>{t('welcome.title')}</h1>;
}
```

### Next.js (next-intl)

```tsx
import { useTranslations } from 'next-intl';

export default function Page() {
  const t = useTranslations('Home');
  return <h1>{t('title')}</h1>;
}
```

### Python (gettext)

```python
from gettext import gettext as _

print(_("Welcome to our app"))
```

---

## 4. File Structure

```
locales/
├── en/
│   ├── common.json
│   ├── auth.json
│   └── errors.json
├── tr/
│   ├── common.json
│   ├── auth.json
│   └── errors.json
└── ar/          # RTL
    └── ...
```
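One way to consume this layout is to load every namespace file for a language into a single dictionary. The sketch below assumes the directory structure shown above; `load_locale` is a hypothetical helper, not part of any library.

```python
import json
from pathlib import Path


def load_locale(locales_dir: Path, language: str) -> dict:
    """Return {namespace: messages} for one language, e.g. {'common': {...}}."""
    catalog = {}
    for path in sorted((locales_dir / language).glob("*.json")):
        # The file stem ("common", "auth", "errors") becomes the namespace
        catalog[path.stem] = json.loads(path.read_text(encoding="utf-8"))
    return catalog
```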

---

## 5. Best Practices

### DO ✅

- Use translation keys, not raw text
- Namespace translations by feature
- Support pluralization
- Handle date/number formats per locale
- Plan for RTL from the start
- Use ICU message format for complex strings

### DON'T ❌

- Hardcode strings in components
- Concatenate translated strings
- Assume text length (German text is often around 30% longer than English)
- Forget about RTL layout
- Mix languages in the same file

---
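Two of the practices above, supporting pluralization and not concatenating translated strings, can be sketched with the standard library's `ngettext`. The function name and message text are made up for illustration; without an installed catalog, `ngettext` falls back to the English singular and plural forms.

```python
from gettext import ngettext


def files_deleted_message(count: int, folder: str) -> str:
    # One complete template per plural form. A translator can reorder
    # {count} and {folder}, which concatenated fragments would not allow.
    template = ngettext(
        "Deleted {count} file from {folder}",
        "Deleted {count} files from {folder}",
        count,
    )
    return template.format(count=count, folder=folder)


print(files_deleted_message(1, "Inbox"))
print(files_deleted_message(3, "Inbox"))
```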

## 6. Common Issues

| Issue | Solution |
|-------|----------|
| Missing translation | Fall back to the default language |
| Hardcoded strings | Use a linter or checker script |
| Date format | Use `Intl.DateTimeFormat` |
| Number format | Use `Intl.NumberFormat` |
| Pluralization | Use ICU message format |

---
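The "missing translation" row in the Common Issues table above amounts to a lookup chain. The sketch below tries the full locale, then its base language, then a default; `translate` and the in-memory `catalogs` shape are assumptions for illustration, and real i18n libraries usually provide this behavior through configuration.

```python
def translate(key: str, locale: str, catalogs: dict, default_locale: str = "en") -> str:
    """Look up key in locale, then its base language, then the default locale."""
    candidates = [locale]
    base_language = locale.split("-")[0]
    if base_language != locale:
        candidates.append(base_language)
    if default_locale not in candidates:
        candidates.append(default_locale)

    for candidate in candidates:
        message = catalogs.get(candidate, {}).get(key)
        if message is not None:
            return message
    return key  # Last resort: show the key so the gap is visible


catalogs = {
    "en": {"greeting": "Hello", "farewell": "Goodbye"},
    "tr": {"greeting": "Merhaba"},
}
print(translate("greeting", "tr-TR", catalogs))  # found via base language "tr"
print(translate("farewell", "tr-TR", catalogs))  # falls back to "en"
```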

## 7. RTL Support

```css
/* CSS Logical Properties */
.container {
  margin-inline-start: 1rem;  /* Not margin-left */
  padding-inline-end: 1rem;   /* Not padding-right */
}

[dir="rtl"] .icon {
  transform: scaleX(-1);
}
```
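The `[dir="rtl"]` selector only takes effect if something sets the `dir` attribute. A minimal server-side sketch for choosing its value from the active locale follows; `text_direction` is a hypothetical helper and the language set is illustrative, not exhaustive.

```python
# Illustrative, non-exhaustive set of right-to-left language codes
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def text_direction(locale: str) -> str:
    """Return 'rtl' or 'ltr' for a tag such as 'ar-EG' or 'en_US'."""
    language = locale.replace("_", "-").split("-")[0].lower()
    return "rtl" if language in RTL_LANGUAGES else "ltr"


# Example: value for the dir attribute of the root <html> element
print(text_direction("ar-EG"))
print(text_direction("en-US"))
```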

---

## 8. Checklist

Before shipping:

- [ ] All user-facing strings use translation keys
- [ ] Locale files exist for all supported languages
- [ ] Date/number formatting uses Intl API
- [ ] RTL layout tested (if applicable)
- [ ] Fallback language configured
- [ ] No hardcoded strings in components

---
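The "no hardcoded strings" item in the checklist above can be spot-checked with a regular expression. The sketch below assumes a pattern similar in spirit to the JSX text rule in `scripts/i18n_checker.py`; `find_hardcoded_text` and the sample markup are hypothetical. It is a heuristic: it misses lowercase or non-ASCII text and can flag text that is not user-facing.

```python
import re

# Capitalized text sitting directly between two tags, e.g. <h1>Hello World</h1>
JSX_TEXT = re.compile(r'>\s*([A-Z][a-zA-Z\s]{3,30}?)\s*</')


def find_hardcoded_text(source: str) -> list:
    """Return candidate hardcoded strings found between tags."""
    return [m.group(1).strip() for m in JSX_TEXT.finditer(source)]


sample = """
<h1>Welcome Back</h1>
<p>{t('home.subtitle')}</p>
<button>Sign In</button>
"""
print(find_hardcoded_text(sample))  # ['Welcome Back', 'Sign In']
```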

## Script

| Script | Purpose | Command |
|--------|---------|---------|
| `scripts/i18n_checker.py` | Detect hardcoded strings & missing translations | `python scripts/i18n_checker.py <project_path>` |

# i18n_checker.py

```python
#!/usr/bin/env python3
"""
i18n Checker - Detects hardcoded strings and missing translations.
Scans for untranslated text in React, Vue, and Python files.
"""
import sys
import re
import json
from pathlib import Path

# Fix Windows console encoding for Unicode output
try:
    sys.stdout.reconfigure(encoding='utf-8', errors='replace')
    sys.stderr.reconfigure(encoding='utf-8', errors='replace')
except AttributeError:
    pass  # Python < 3.7

# Patterns that indicate hardcoded strings (should be translated)
HARDCODED_PATTERNS = {
    'jsx': [
        # Text directly in JSX: <div>Hello World</div>
        r'>\s*[A-Z][a-zA-Z\s]{3,30}\s*</',
        # JSX attribute strings: title="Welcome"
        r'(title|placeholder|label|alt|aria-label)="[A-Z][a-zA-Z\s]{2,}"',
        # Button/heading text
        r'<(button|h[1-6]|p|span|label)[^>]*>\s*[A-Z][a-zA-Z\s!?.,]{3,}\s*</',
    ],
    'vue': [
        # Vue template text
        r'>\s*[A-Z][a-zA-Z\s]{3,30}\s*</',
        r'(placeholder|label|title)="[A-Z][a-zA-Z\s]{2,}"',
    ],
    'python': [
        # print/raise with string literals
        r'(print|raise\s+\w+)\s*\(\s*["\'][A-Z][^"\']{5,}["\']',
        # Flask flash messages
        r'flash\s*\(\s*["\'][A-Z][^"\']{5,}["\']',
    ]
}

# Patterns that indicate proper i18n usage
I18N_PATTERNS = [
    r'\bt\(["\']',         # t('key') - react-i18next; \b keeps print("...") from matching
    r'useTranslation',     # React hook
    r'\$t\(',              # Vue i18n
    r'(?<!\w)_\(["\']',    # Python gettext alias _('...'), not foo_('...')
    r'gettext\(',          # Python gettext
    r'useTranslations',    # next-intl
    r'FormattedMessage',   # react-intl
    r'i18n\.',             # Generic i18n
]

def find_locale_files(project_path: Path) -> list:
    """Find translation/locale files."""
    patterns = [
        "**/locales/**/*.json",
        "**/translations/**/*.json",
        "**/lang/**/*.json",
        "**/i18n/**/*.json",
        "**/messages/*.json",
        "**/*.po",  # gettext
    ]
    
    files = []
    for pattern in patterns:
        files.extend(project_path.glob(pattern))
    
    return [f for f in files if 'node_modules' not in str(f)]

def check_locale_completeness(locale_files: list) -> dict:
    """Check if all locales have the same keys."""
    issues = []
    passed = []
    
    if not locale_files:
        return {'passed': [], 'issues': ["[!] No locale files found"]}
    
    # Group by parent folder (language)
    locales = {}
    for f in locale_files:
        if f.suffix == '.json':
            try:
                lang = f.parent.name
                content = json.loads(f.read_text(encoding='utf-8'))
                keys = set(flatten_keys(content))
            except (OSError, ValueError, AttributeError):
                # Unreadable file, invalid JSON, or a top level that is not an object
                continue
            locales.setdefault(lang, {})[f.stem] = keys
    
    if len(locales) < 2:
        passed.append(f"[OK] Found {len(locale_files)} locale file(s)")
        return {'passed': passed, 'issues': issues}
    
    passed.append(f"[OK] Found {len(locales)} language(s): {', '.join(locales.keys())}")
    
    # Compare keys across locales, using the first language found as the baseline
    all_langs = list(locales.keys())
    base_lang = all_langs[0]
    
    for namespace in locales.get(base_lang, {}):
        base_keys = locales[base_lang].get(namespace, set())
        
        for lang in all_langs[1:]:
            other_keys = locales.get(lang, {}).get(namespace, set())
            
            missing = base_keys - other_keys
            if missing:
                issues.append(f"[X] {lang}/{namespace}: Missing {len(missing)} keys")
            
            extra = other_keys - base_keys
            if extra:
                issues.append(f"[!] {lang}/{namespace}: {len(extra)} extra keys")
    
    if not issues:
        passed.append("[OK] All locales have matching keys")
    
    return {'passed': passed, 'issues': issues}

def flatten_keys(d, prefix=''):
    """Flatten nested dict keys into dotted paths, e.g. 'welcome.title'."""
    keys = set()
    for k, v in d.items():
        new_key = f"{prefix}.{k}" if prefix else k
        if isinstance(v, dict):
            keys.update(flatten_keys(v, new_key))
        else:
            keys.add(new_key)
    return keys

def check_hardcoded_strings(project_path: Path) -> dict:
    """Check for hardcoded strings in code files."""
    issues = []
    passed = []
    
    # Find code files
    extensions = {
        '.tsx': 'jsx', '.jsx': 'jsx', '.ts': 'jsx', '.js': 'jsx',
        '.vue': 'vue',
        '.py': 'python'
    }
    
    code_files = []
    for ext in extensions:
        code_files.extend(project_path.rglob(f"*{ext}"))
    
    # Match excluded names against the path relative to the project root, so a
    # project that itself lives under e.g. ".../test-app/" is not skipped entirely
    excluded = ['node_modules', '.git', 'dist', 'build', '__pycache__', 'venv', 'test', 'spec']
    code_files = [f for f in code_files
                  if not any(x in str(f.relative_to(project_path)) for x in excluded)]
    
    if not code_files:
        return {'passed': ["[!] No code files found"], 'issues': []}
    
    files_with_i18n = 0
    files_with_hardcoded = 0
    hardcoded_examples = []
    
    analyzed_files = code_files[:50]  # Limit
    for file_path in analyzed_files:
        try:
            content = file_path.read_text(encoding='utf-8', errors='ignore')
        except OSError:
            continue
        
        file_type = extensions.get(file_path.suffix, 'jsx')
        
        # Check for i18n usage; files that already use i18n are not flagged
        has_i18n = any(re.search(p, content) for p in I18N_PATTERNS)
        if has_i18n:
            files_with_i18n += 1
            continue
        
        # Check for hardcoded strings
        hardcoded_found = False
        for pattern in HARDCODED_PATTERNS.get(file_type, []):
            match = re.search(pattern, content)
            if match:
                hardcoded_found = True
                if len(hardcoded_examples) < 5:
                    # group(0) is the full matched text rather than a capture group
                    snippet = ' '.join(match.group(0).split())[:40]
                    hardcoded_examples.append(f"{file_path.name}: {snippet}...")
        
        if hardcoded_found:
            files_with_hardcoded += 1
    
    # Report the number of files actually scanned, not the number found
    passed.append(f"[OK] Analyzed {len(analyzed_files)} code files")
    
    if files_with_i18n > 0:
        passed.append(f"[OK] {files_with_i18n} files use i18n")
    
    if files_with_hardcoded > 0:
        issues.append(f"[X] {files_with_hardcoded} files may have hardcoded strings")
        for ex in hardcoded_examples:
            issues.append(f"   → {ex}")
    else:
        passed.append("[OK] No obvious hardcoded strings detected")
    
    return {'passed': passed, 'issues': issues}

def main():
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    project_path = Path(target)
    
    print("\n" + "=" * 60)
    print("  i18n CHECKER - Internationalization Audit")
    print("=" * 60 + "\n")
    
    # Check locale files
    locale_files = find_locale_files(project_path)
    locale_result = check_locale_completeness(locale_files)
    
    # Check hardcoded strings
    code_result = check_hardcoded_strings(project_path)
    
    # Print results
    print("[LOCALE FILES]")
    print("-" * 40)
    for item in locale_result['passed']:
        print(f"  {item}")
    for item in locale_result['issues']:
        print(f"  {item}")
    
    print("\n[CODE ANALYSIS]")
    print("-" * 40)
    for item in code_result['passed']:
        print(f"  {item}")
    for item in code_result['issues']:
        print(f"  {item}")
    
    # Summary: only "[X]" entries count as critical; "[!]" entries are warnings
    critical_issues = sum(1 for i in locale_result['issues'] + code_result['issues'] if i.startswith("[X]"))
    
    print("\n" + "=" * 60)
    if critical_issues == 0:
        print("[OK] i18n CHECK: PASSED")
        sys.exit(0)
    else:
        print(f"[X] i18n CHECK: {critical_issues} issues found")
        sys.exit(1)

if __name__ == "__main__":
    main()
```
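To make the completeness check concrete, here is a small worked example of the key-flattening idea used by `check_locale_completeness`. The two dictionaries are made-up sample data, and `flatten_keys` is repeated so the block runs on its own.

```python
def flatten_keys(d: dict, prefix: str = "") -> set:
    """Same approach as the script: nested keys become dotted paths."""
    keys = set()
    for k, v in d.items():
        path = f"{prefix}.{k}" if prefix else k
        if isinstance(v, dict):
            keys |= flatten_keys(v, path)
        else:
            keys.add(path)
    return keys


# Hypothetical contents of en/common.json and tr/common.json
en = {"welcome": {"title": "Welcome", "subtitle": "Glad you are here"}, "logout": "Log out"}
tr = {"welcome": {"title": "Hoş geldiniz"}, "logout": "Çıkış yap", "legacy": "Eski"}

missing = flatten_keys(en) - flatten_keys(tr)  # keys the tr file still needs
extra = flatten_keys(tr) - flatten_keys(en)    # keys tr has but en does not

print(sorted(missing))  # ['welcome.subtitle']
print(sorted(extra))    # ['legacy']
```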