What to verify before you can trust AI physics simulation

For AI CAE to enter real product decisions, "how far can it be trusted" comes before "it is fast." A surrogate model can show a low average error and still fail sharply outside the operating regions, boundary conditions, and geometry classes it covers.

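Verification and validation is central to traditional CAE as well. ASME V&V 40 explains that the credibility required of a computational model should match how heavily the decision relies on the model as evidence and how severe the consequence of a wrong decision would be. This principle applies even more strongly to AI surrogates.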

Verification and validation are different

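Verification checks that the model behaves as its intended mathematical and software implementation says it should. Validation checks that the model represents the real world accurately enough within its intended use. For AI surrogates, data validation and deployment validation are added on top of these.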

Code verification: are the inference pipeline, normalization, geometry preprocessing, and unit conversion correct?
Model verification: is the model consistent with the reference solver on benchmark cases?
Data validation: does the training data sufficiently cover the geometry, mesh, and operating conditions of the intended use?
Deployment validation: does error stay unamplified once the model is placed inside a real simulator or product workflow?
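
The first two checks in the list above can be written as small automated tests. The sketch below is a minimal illustration, not a prescribed procedure: the function names, the mean/std normalization scheme, and the tolerances are all assumptions made for the example.

```python
import math

# Hypothetical helpers for illustration; a real pipeline will have its own.

def normalize(x, mean, std):
    """Scale a physical value into the model's input space (assumed z-score)."""
    return (x - mean) / std

def denormalize(z, mean, std):
    """Map a model-space value back to physical units."""
    return z * std + mean

def round_trip_ok(values, mean, std, abs_tol=1e-9):
    """Code verification: normalize then denormalize must recover each input."""
    return all(
        math.isclose(denormalize(normalize(v, mean, std), mean, std), v,
                     rel_tol=0.0, abs_tol=abs_tol)
        for v in values
    )

def max_rel_error(predicted, reference):
    """Largest relative deviation from the reference solver over all cases."""
    return max(abs(p - r) / abs(r) for p, r in zip(predicted, reference))

def benchmark_ok(predicted, reference, tol):
    """Model verification: every benchmark case must stay within tol."""
    return max_rel_error(predicted, reference) <= tol
```

Using the maximum rather than the mean in `benchmark_ok` is a deliberate choice here: a single badly wrong benchmark case should fail the check even when the average looks fine.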

Stand-alone accuracy is not enough

A 2026 paper on power-system surrogate V&V gives an important warning. Even when a surrogate fits well on average as a stand-alone component model, trajectory error can grow once it is inserted into a differential-algebraic simulator, because of coupling sensitivity and dynamic error amplification. The paper reports that discrepancy concentrates in stressed operating regions, and that a small equation residual does not guarantee a small state-trajectory error.
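
A toy example makes the amplification effect concrete. The sketch below is not the paper's differential-algebraic simulator; it assumes a scalar linear map with made-up gains (1.05 for the reference, 1.06 for the surrogate) purely to show how a small one-step error compounds over a closed-loop rollout.

```python
# Toy illustration only: the gains below are assumed values, not from any paper.

def reference_step(x):
    """One step of the 'true' dynamics (assumed gain of 1.05)."""
    return 1.05 * x

def surrogate_step(x):
    """One step of the surrogate, slightly off from the reference."""
    return 1.06 * x

def rollout(step, x0, n_steps):
    """Feed each output back in as the next input, as a simulator would."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(step(xs[-1]))
    return xs

def one_step_rel_error(x):
    """Stand-alone error: both models evaluated from the same state."""
    ref = reference_step(x)
    return abs(surrogate_step(x) - ref) / abs(ref)

def rollout_rel_error(x0, n_steps):
    """Closed-loop error: each model evolves on its own trajectory."""
    ref = rollout(reference_step, x0, n_steps)[-1]
    sur = rollout(surrogate_step, x0, n_steps)[-1]
    return abs(sur - ref) / abs(ref)
```

Comparing `one_step_rel_error(1.0)` with `rollout_rel_error(1.0, 50)` shows the gap: the stand-alone error stays below 1%, while the error at the end of the 50-step rollout is many times larger.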

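This message carries over directly to structural, thermal, and flow analysis. Even when the error on a single field is low, the product decision can be wrong if the quantities of interest (QoI) that drive design judgment are off, for example the location of maximum temperature, pressure drop, stress concentration, safety margin, or drag coefficient.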
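
The gap between field error and QoI error can be seen on a small made-up temperature profile. The values, the 330 K limit, and the function names below are assumptions chosen for illustration, not data from any solver.

```python
# Made-up 1D temperature samples in kelvin; not results from a real solver.
REFERENCE = [300.0, 302.0, 305.0, 340.0, 306.0, 303.0, 301.0, 300.0]
SURROGATE = [300.0, 302.0, 305.0, 318.0, 322.0, 303.0, 301.0, 300.0]
LIMIT_K = 330.0  # hypothetical design limit

def mean_abs_error(predicted, reference):
    """Field-level metric: average absolute deviation over all points."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)

def peak(field):
    """QoI: maximum value and the index where it occurs."""
    idx = max(range(len(field)), key=field.__getitem__)
    return field[idx], idx

def exceeds_limit(field, limit):
    """QoI: does any point cross the design limit?"""
    return peak(field)[0] > limit

field_mae = mean_abs_error(SURROGATE, REFERENCE)
ref_peak, ref_idx = peak(REFERENCE)
sur_peak, sur_idx = peak(SURROGATE)
peak_error = abs(sur_peak - ref_peak)
```

Here the mean absolute error is under 5 K, yet the surrogate puts the hot spot at a different location, underestimates the peak by 18 K, and misses the limit crossing that the reference shows.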

AI surrogate evidence package

Using AI CAE in product decisions requires at least the following evidence.

Intended use: state clearly whether the model is for screening, design optimization, or report evidence.
Domain of applicability: record the limits on geometry family, Reynolds number, material range, load case, and mesh/data format.
Reference baseline: record which solver, turbulence model, material model, and experimental data the model was compared against.
Metrics: look beyond average error to QoI error, worst-case error, spatial localization, and threshold crossing.
Uncertainty: confidence, conformal calibration, ensemble spread, and an out-of-distribution indicator are needed.
Fallback policy: there must be criteria for returning to the full solver, experiment, or human review once a result leaves the trusted range.
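
Three of the items above (domain of applicability, uncertainty, fallback policy) can be tied together in a few lines. The sketch below is one possible arrangement under stated assumptions: the `Range` type, the input names, and the decision strings are hypothetical, and the uncertainty band uses plain split conformal calibration on absolute residuals.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Range:
    """Validated interval for one input (hypothetical structure)."""
    low: float
    high: float

    def contains(self, value):
        return self.low <= value <= self.high

def out_of_domain(inputs, domain):
    """Names of inputs that are missing or outside their validated range."""
    return [name for name, rng in domain.items()
            if name not in inputs or not rng.contains(inputs[name])]

def conformal_half_width(calib_abs_residuals, alpha):
    """Split conformal band: the ceil((n+1)(1-alpha))-th smallest residual.

    Returns infinity when the calibration set is too small for the
    requested coverage level.
    """
    n = len(calib_abs_residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(calib_abs_residuals)[k - 1]

def decide(inputs, domain, half_width, max_half_width):
    """Fallback policy: the decision labels here are illustrative only."""
    violations = out_of_domain(inputs, domain)
    if violations:
        return "fallback: full solver", violations
    if half_width > max_half_width:
        return "fallback: human review", []
    return "accept surrogate", []
```

The order of checks in `decide` is a design choice: an out-of-domain input is rejected before the error band is even consulted, because a calibration set drawn from inside the domain says little about points outside it.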

The simulation governance view

Simulation governance matters more once AI enters the picture. With a conventional solver, the things to track were the mesh, boundary conditions, material card, and solver settings. An AI surrogate adds to that list the training dataset version, preprocessing, model checkpoint, inference configuration, calibration data, and benchmark report.
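
One way to make this tracking concrete is a single provenance record attached to every surrogate result. The sketch below is an assumption about how such a record might look: the class name, the use of plain strings for each field, and the SHA-256 fingerprint are illustrative choices, not an established schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class SurrogateRunRecord:
    """Hypothetical provenance record; fields follow the list in the text."""
    training_dataset_version: str
    preprocessing: str
    model_checkpoint: str
    inference_configuration: str
    calibration_data: str
    benchmark_report: str

def fingerprint(record):
    """Stable SHA-256 hash of the record, usable as a traceability key."""
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Because the fingerprint changes whenever any tracked item changes, two results with the same fingerprint can be treated as coming from the same surrogate setup, and a mismatch signals that something in the chain was updated.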

A practical AI CAE product should not lead with contour images. It should first show what data and conditions produced the result, over what range it has been validated, and where the user must switch to the full solver.

V&V gate example

Gate	Question	Required evidence	If it does not pass
G0 Data	Does the training data cover the intended use?	geometry/material/load range, OOD split, missing regime log	Narrow the model's scope of use or go back to data acquisition.
G1 Solver label	Can the reference labels be trusted?	solver version, mesh convergence, residual, validation case	Regenerate the CFD/FEA labels before building the surrogate.
G2 Model	Are both the QoI and the field accurate?	field error, max temperature/drag/stress error, worst-case list	Restrict to design screening only, even if the average error is low.
G3 Deployment	Does error stay unamplified in the product workflow?	closed-loop test, fallback trigger, human review policy	Fall back to the full solver or add an approval step.
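
The gates in the table can be read as an ordered checklist where the first failure determines the action. The sketch below assumes each gate has already been reduced to a pass/fail flag; how that flag is derived from the evidence is outside its scope, and the function name is hypothetical.

```python
# Gate names and actions follow the table; the pass/fail inputs are assumed
# to come from a separate evidence review.
GATES = [
    ("G0 Data", "narrow the scope of use or go back to data acquisition"),
    ("G1 Solver label", "regenerate the CFD/FEA labels before building the surrogate"),
    ("G2 Model", "restrict to design screening only"),
    ("G3 Deployment", "fall back to the full solver or add an approval step"),
]

def first_failed_gate(results):
    """Return (gate, action) for the earliest gate not passed, else None.

    A gate with no recorded result counts as not passed.
    """
    for name, action in GATES:
        if not results.get(name, False):
            return name, action
    return None
```

Treating a missing result as a failure keeps the check conservative: a surrogate cannot reach G3 just because nobody recorded an answer for G1.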

The RHX view

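If RHXY Sim uses AI, it matters to show an evidence layer alongside the result rather than present the result as definitive. The input conditions, whether the result came from a solver or a surrogate, the domain of applicability, the error band, and the next review action should all be on one screen. That way non-experts can understand the result quickly while engineers can check the limits of trust.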

[Figure: pre-reading review map. Panels: Question, Inputs, Gate, Output]