chore: handle aggregation queries sequentially follow up #8450

Merged
Changes from 6 commits
38 changes: 28 additions & 10 deletions src/lib/metrics-gauge.ts
@@ -1,11 +1,14 @@
+import type { Logger } from './logger';
 import type { IUnleashConfig } from './types';
 import { createGauge, type Gauge } from './util/metrics';

 type RestrictedRecord<T extends readonly string[]> = Record<T[number], string>;
 type Query<R> = () => Promise<R | undefined | null>;
-type MapResult<R> = (result: R) => {
+type MetricValue<R> = {
     count: number;
     labels: RestrictedRecord<GaugeDefinition<R>['labelNames']>;
 };
+type MapResult<R> = (result: R) => MetricValue<R> | MetricValue<R>[];
Contributor Author:
A single task can output different counts for different labels. This is useful for cases like client_apps_total at line 40.
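For illustration, a `map` that fans one query result out to several label sets could look like this — the gauge name, read model, and result shape below are hypothetical, not part of this diff:

```ts
// Hypothetical example: one query produces two series split by a label.
dbMetrics.registerGaugeDbMetric({
    name: 'features_by_status_total', // assumed name, for illustration only
    help: 'Number of features, split by status',
    labelNames: ['status'],
    // assumed result shape: { active: number; stale: number }
    query: () => featureReadModel.countByStatus(),
    // map may now return MetricValue<R>[] instead of a single value
    map: (result) => [
        { count: result.active, labels: { status: 'active' } },
        { count: result.stale, labels: { status: 'stale' } },
    ],
});
```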


 type GaugeDefinition<T> = {
     name: string;
@@ -19,20 +22,35 @@ type Task = () => Promise<void>;
 export class DbMetricsMonitor {
     private tasks: Set<Task> = new Set();
     private gauges: Map<string, Gauge<string>> = new Map();
+    private logger: Logger;

-    constructor() {}
+    constructor(config: IUnleashConfig) {
+        this.logger = config.getLogger('gauge-metrics');
+    }
+
+    private asArray<T>(value: T | T[]): T[] {
+        return Array.isArray(value) ? value : [value];
+    }

-    registerGaugeDbMetric<T>(definition: GaugeDefinition<T>) {
+    registerGaugeDbMetric<T>(definition: GaugeDefinition<T>): Task {
         const gauge = createGauge(definition);
         this.gauges.set(definition.name, gauge);
-        this.tasks.add(async () => {
-            const result = await definition.query();
-            if (result) {
-                const { count, labels } = definition.map(result);
-                gauge.reset();
-                gauge.labels(labels).set(count);
-            }
-        });
+        const task = async () => {
+            try {
+                const result = await definition.query();
+                if (result !== null && result !== undefined) {
+                    const results = this.asArray(definition.map(result));
+                    gauge.reset();
+                    for (const r of results) {
+                        gauge.labels(r.labels).set(r.count);
+                    }
+                }
+            } catch (e) {
+                this.logger.warn(`Failed to refresh ${definition.name}`, e);
+            }
+        };
+        this.tasks.add(task);
+        return task;
     }

     refreshDbMetrics = async () => {
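As a usage sketch of the new error handling (hypothetical gauge; assumes refreshDbMetrics awaits each registered task, and that `config` is an IUnleashConfig):

```ts
// Hypothetical: a failing query is caught per task and logged via the
// injected logger instead of breaking the whole refresh cycle.
const monitor = new DbMetricsMonitor(config);
monitor.registerGaugeDbMetric({
    name: 'flaky_metric', // illustrative name
    help: 'Gauge whose query always fails',
    labelNames: [],
    query: () => Promise.reject(new Error('db down')),
    map: () => ({ count: 0, labels: {} }),
});
// Expected to warn "Failed to refresh flaky_metric" and continue
// refreshing any remaining gauges.
await monitor.refreshDbMetrics();
```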
99 changes: 48 additions & 51 deletions src/lib/metrics.ts
@@ -57,7 +57,7 @@ export default class MetricsMonitor {

         const { eventStore, environmentStore } = stores;
         const { flagResolver } = config;
-        const dbMetrics = new DbMetricsMonitor();
+        const dbMetrics = new DbMetricsMonitor(config);

         const cachedEnvironments: () => Promise<IEnvironment[]> = memoizee(
             async () => environmentStore.getAll(),
@@ -117,15 +117,29 @@ export default class MetricsMonitor {
             help: 'Number of times a feature flag has been used',
             labelNames: ['toggle', 'active', 'appName'],
         });
-        const featureFlagsTotal = createGauge({
+
+        // schedule and execute immediately
+        await dbMetrics.registerGaugeDbMetric({
Contributor Author:
These tasks need to execute immediately because the tests expect that. I'd argue that some of them could be delayed, but they may be important because other components depend on them. So I'm deferring this decision until we've migrated all gauges to dbMetrics.
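In other words, because registerGaugeDbMetric now returns the refresh Task, each call site can choose between the two shapes below (a sketch; `definition` is a placeholder):

```ts
// (a) register only — the gauge gets its first value on the next
//     scheduled refreshDbMetrics() run
dbMetrics.registerGaugeDbMetric(definition);

// (b) register and run the returned task right away — tests and any
//     dependent components see an initial value immediately
await dbMetrics.registerGaugeDbMetric(definition)();
```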

             name: 'feature_toggles_total',
             help: 'Number of feature flags',
             labelNames: ['version'],
-        });
-        const maxFeatureEnvironmentStrategies = createGauge({
+            query: () => instanceStatsService.getToggleCount(),
+            map: (count) => ({ count, labels: { version } }),
+        })();
+
+        dbMetrics.registerGaugeDbMetric({
             name: 'max_feature_environment_strategies',
             help: 'Maximum number of environment strategies in one feature',
             labelNames: ['feature', 'environment'],
+            query: () =>
+                stores.featureStrategiesReadModel.getMaxFeatureEnvironmentStrategies(),
+            map: (result) => ({
+                count: result.count,
+                labels: {
+                    environment: result.environment,
+                    feature: result.feature,
+                },
+            }),
         });

         dbMetrics.registerGaugeDbMetric({
@@ -246,11 +260,18 @@ export default class MetricsMonitor {
             help: 'Number of strategies',
         });

-        const clientAppsTotal = createGauge({
+        // execute immediately to get initial values
+        await dbMetrics.registerGaugeDbMetric({
             name: 'client_apps_total',
             help: 'Number of registered client apps aggregated by range by last seen',
             labelNames: ['range'],
-        });
+            query: () => instanceStatsService.getLabeledAppCounts(),
+            map: (result) =>
+                Object.entries(result).map(([range, count]) => ({
+                    count,
+                    labels: { range },
+                })),
+        })();

         const samlEnabled = createGauge({
             name: 'saml_enabled',
@@ -408,7 +429,6 @@ export default class MetricsMonitor {

         const stats = await instanceStatsService.getStats();
Contributor Author:
getStats was triggering many queries in parallel, some of which were not used in Prometheus.
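A minimal sketch of the sequential pattern this follow-up moves toward — assuming refreshDbMetrics awaits one registered task at a time (its implementation is not shown in this diff):

```ts
// Assumed shape: per-gauge queries run one after another instead of
// being fired all at once by a single getStats() call.
refreshDbMetrics = async () => {
    for (const task of this.tasks) {
        await task(); // each task already catches its own errors
    }
};
```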

         const [
-            maxEnvironmentStrategies,
             maxConstraintValuesResult,
             maxConstraintsPerStrategyResult,
             stageCountByProjectResult,
@@ -419,7 +439,6 @@
             instanceOnboardingMetrics,
             projectsOnboardingMetrics,
         ] = await Promise.all([
-            stores.featureStrategiesReadModel.getMaxFeatureEnvironmentStrategies(),
             stores.featureStrategiesReadModel.getMaxConstraintValues(),
             stores.featureStrategiesReadModel.getMaxConstraintsPerStrategy(),
             stores.featureLifecycleReadModel.getStageCountByProject(),
@@ -439,9 +458,6 @@
                 : Promise.resolve([]),
         ]);

-        featureFlagsTotal.reset();
-        featureFlagsTotal.labels({ version }).set(stats.featureToggles);
-
         featureTogglesArchivedTotal.reset();
         featureTogglesArchivedTotal.set(stats.archivedFeatureToggles);
@@ -460,30 +476,6 @@
Expand All @@ -460,30 +476,6 @@ export default class MetricsMonitor {
.set(stage.duration);
});

eventBus.on(
events.STAGE_ENTERED,
(entered: { stage: string; feature: string }) => {
if (flagResolver.isEnabled('trackLifecycleMetrics')) {
logger.info(
`STAGE_ENTERED listened ${JSON.stringify(entered)}`,
);
}
featureLifecycleStageEnteredCounter.increment({
stage: entered.stage,
});
},
);

eventBus.on(
events.EXCEEDS_LIMIT,
({
resource,
limit,
}: { resource: string; limit: number }) => {
exceedsLimitErrorCounter.increment({ resource, limit });
},
);
Contributor Author (on lines -463 to -485):
Moved up to line 485 on the right side; it now lives inside registerPrometheusMetrics instead of startMonitoring.
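The design point, sketched with names from this diff: EventEmitter-style `on` calls stack handlers, so listener registration belongs in a one-time setup path like registerPrometheusMetrics rather than anywhere that may be re-executed (an assumption about the old call site, not a claim from the PR):

```ts
// One-time setup inside registerPrometheusMetrics: bound exactly once.
eventBus.on(events.EXCEEDS_LIMIT, ({ resource, limit }) =>
    exceedsLimitErrorCounter.increment({ resource, limit }),
);

// By contrast, calling eventBus.on(...) inside a periodically scheduled
// collector would add a fresh handler on every run and inflate counters.
```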


         featureLifecycleStageCountByProject.reset();
         stageCountByProjectResult.forEach((stageResult) =>
             featureLifecycleStageCountByProject
@@ -512,16 +504,6 @@
         legacyTokensActive.reset();
         legacyTokensActive.set(deprecatedTokens.activeLegacyTokens);

-        if (maxEnvironmentStrategies) {
Contributor Author:
This got migrated to: https://github.com/Unleash/unleash/pull/8450/files#diff-0610ce1595d1dcf150ca2efb2f0c5806d082f5bc6b8429a437970c7d60171cbdR181-R194 where both the gauge definition and refresh query are together

-            maxFeatureEnvironmentStrategies.reset();
-            maxFeatureEnvironmentStrategies
-                .labels({
-                    environment: maxEnvironmentStrategies.environment,
-                    feature: maxEnvironmentStrategies.feature,
-                })
-                .set(maxEnvironmentStrategies.count);
-        }
-
         if (maxConstraintValuesResult) {
             maxConstraintValues.reset();
             maxConstraintValues
@@ -653,11 +635,6 @@
         oidcEnabled.reset();
         oidcEnabled.set(stats.OIDCenabled ? 1 : 0);

-        clientAppsTotal.reset();
-        stats.clientApps.forEach(({ range, count }) =>
-            clientAppsTotal.labels({ range }).set(count),
-        );
-
         rateLimits.reset();
         rateLimits
             .labels({
@@ -720,7 +697,27 @@
             collectStaticCounters.bind(this),
             hoursToMilliseconds(2),
             'collectStaticCounters',
+            0, // no jitter
         );

+        eventBus.on(
+            events.EXCEEDS_LIMIT,
+            ({ resource, limit }: { resource: string; limit: number }) => {
+                exceedsLimitErrorCounter.increment({ resource, limit });
+            },
+        );
+
+        eventBus.on(
+            events.STAGE_ENTERED,
+            (entered: { stage: string; feature: string }) => {
+                if (flagResolver.isEnabled('trackLifecycleMetrics')) {
+                    logger.info(
+                        `STAGE_ENTERED listened ${JSON.stringify(entered)}`,
+                    );
+                }
+                featureLifecycleStageEnteredCounter.increment({
+                    stage: entered.stage,
+                });
+            },
+        );

         eventBus.on(