Compare commits

..

17 Commits

Author SHA1 Message Date
Pierre Bourdon
c3b6e7b425 queue runner: fix nullptr deref on build exception after releasing a machine reservation 2025-02-16 13:27:26 +01:00
K900
c60e7955bf Add metric for builds waiting for download slot
(cherry picked from commit f23ec71227911891807706b6b978836e4d80edde)
2025-02-12 10:35:17 +01:00
Maximilian Bosch
90399cb674 readIntoSocket: fix with store URIs containing an &
The third argument to `open()` in `-|` mode is passed to a shell if it's
a string. In my case the store URI contains
`?secret-key=${signingKey.directory}/secret&compression=zstd`

For the `nix store cat` case this means that

* until `&` the process will be started in the background. This fails
  immediately because no path to cat is specified.
* `compression=zstd` is a variable assignment
* the `$path` argument to `store cat` is attempted to be executed as
  another command

Passing just the list solves the problem.

(cherry picked from commit 3ee51dbe589458cc54ff753317bbc6db530bddc0)
2025-02-12 10:35:17 +01:00
git@71rd.net
d6a3ef484c Stream files from store instead of buffering them
When an artifact is requested from hydra the output is first copied
from the nix store into memory and then sent as a response, delaying
the download and taking up significant amounts of memory.

As reported in https://github.com/NixOS/hydra/issues/1357

Instead of calling a command and blocking while reading in the entire
output, this adds read_into_socket(). the function takes a
command, starting a subprocess with that command, returning a file
descriptor attached to stdout.
This file descriptor is then by responsebuilder of Catalyst to steam
the output directly

(cherry picked from commit 459aa0a5983a0bd546399c08231468d6e9282f54)
2025-02-12 10:35:17 +01:00
Pierre Bourdon
685857df2e web: replace 'errormsg' with 'errormsg IS NULL' in most cases
This is implement in an extremely hacky way due to poor DBIx feature
support. Ideally, what we'd need is a way to tell DBIx to ignore the
errormsg column unless explicitly requested, and to automatically add a
computed 'errormsg IS NULL' column in others. Since it does not support
that, this commit instead hacks some support via method overrides while
taking care to not break anything obvious.
2025-02-12 10:35:17 +01:00
ajs124
f2b6e9d8ab lazy-load evaluation errors
Closes #1362
2025-02-12 10:35:17 +01:00
Maximilian Bosch
b04847335f Only show stepname if it doesn't equal the name of the drv
When building e.g. nixpkgs, the "Running builds" view will mostly look
like this

    hello.x86_64-linux (Build of hello-X.Y)
    exa.x86_64-linux (Build of exa-X.Y)
    ...

This doesn't provide any useful information. Showing the step name only
makes sense if it's not a child of the job's derivation. With this
patch, that information will only be shown if the drv name (i.e. w/o
`/nix/store/` prefix, .drv ext & hash) is not equal to the drv name of
the job itself (build.nixname).
2025-02-12 10:35:17 +01:00
Maximilian Bosch
7e61000172 Running builds view: show build step names
When using Hydra to build machine configurations, you'll often see
"nixosConfigurations.foo" five times, i.e. for each build step being
run. This isn't very helpful I think because in such a case, a single
build step can also be compiling the Linux kernel.

This change also fetches the `drvpath` and `type` from the `buildsteps`
relation. We're already joining it, so this doesn't make much difference
(confirmed via query logging that this doesn't cause extra SQL queries).

Unfortunately build steps don't have a human readable name, so I'm
deriving it from the drvpath by stripping away the hash (assuming that
it'll never contain a `-` and that `/nix/store/` is used as prefix). I
decided against using the Nix bindings for that to avoid too much
overhead due to store operations for each build step.
2025-02-12 10:35:17 +01:00
Maximilian Bosch
6d04e824d5 Make "timed out" and "log limit exceeded" builds aborted
In 73694087a0 I gave builds that failed
because of a timeout or exceeded log limit a stop sign and I stand by
that reasoning: with that it's possible to distinguish between actual
build failures and rather transient things such as timeouts.

Back then I considered it a feature that these are shown in a different
tab, but I don't think that's a good idea anymore. When using a jobset to
e.g. track the regressions from a mass rebuild (like a compiler or gcc
update), "Newly failed builds" should exclusively display regressions (and
flaky builds of course, not much I can do about that).

Also, when a bunch of builds fail in such a jobset because of e.g. a
broken connection to a builder that results in a timeout, I want to be
able to restart them all w/o rebuilding actual regressions.

To make it clear that we not only have "Aborted" builds in the tab, I
renamed the label to "Aborted / Timed out".
2025-02-12 10:35:16 +01:00
Pierre Bourdon
36e25d8fd2 queue-runner: release machine reservation while copying outputs
This allows for better builder usage when the queue runner is busy. To
avoid running into uncontrollable imbalances between builder/queue
runner, we only release the machine reservation after the local
throttler has found a slot to start copying the outputs for that build.
2025-02-12 10:35:16 +01:00
Pierre Bourdon
d4e273f7b1 queue-runner: switch to pseudorandom ordering of builds processing
We don't rely on sequential / monotonic build IDs processing anymore, so
randomizing actually has the advantage of mixing builds for different
systems together, to avoid only one chunk of builds for a single system
getting processed while builders for other systems are starved.
2025-02-12 10:35:16 +01:00
Pierre Bourdon
23366ec10d queue runner: introduce some parallelism for remote paths lookup
Each output for a given step being ingested is looked up in parallel,
which should basically multiply the speed of builds ingestion by the
average number of outputs per derivation.
2025-02-12 10:35:16 +01:00
Pierre Bourdon
98759e4ff9 queue-runner: reduce the time between queue monitor restarts
This will induce more DB queries (though these are fairly cheap), but at
the benefit of processing bumps within 1m instead of within 10m.
2025-02-12 10:35:16 +01:00
Pierre Bourdon
d850c99883 queue-runner: remove id > X from new builds query
Running the query with/without it shows that it makes no difference to
postgres, since there's an index on finished=0 already. This allows a
few simplifications, but also paves the way towards running multiple
parallel monitor threads in the future.
2025-02-12 10:35:16 +01:00
Pierre Bourdon
1b76eec4e8 queue-runner: add prom metrics to allow detecting internal bottlenecks
By looking at the ratio of running vs. waiting for the dispatcher and
the queue monitor, we should get better visibility into what hydra is
currently bottlenecked on.

There are other side effects we can try to measure to get to the same
result, but having a simple way doesn't cost us much.
2025-02-12 10:35:15 +01:00
Pierre Bourdon
3bb1a61a7d web: include current step status on /machines 2025-02-12 10:35:15 +01:00
Pierre Bourdon
3b9045f60d queue-runner: limit parallelism of CPU intensive operations
My current theory is that running more parallel xz than available CPU
cores is reducing our overall throughput by requiring more scheduling
overhead and more cache thrashing.
2025-02-12 10:35:15 +01:00
41 changed files with 410 additions and 644 deletions

View File

@@ -1,10 +1,7 @@
name: "Test"
on:
pull_request:
merge_group:
push:
branches:
- master
jobs:
tests:
runs-on: ubuntu-latest

1
.gitignore vendored
View File

@@ -3,7 +3,6 @@
/src/sql/hydra-postgresql.sql
/src/sql/hydra-sqlite.sql
/src/sql/tmp.sqlite
.hydra-data
result
result-*
outputs

View File

@@ -72,16 +72,17 @@ Make sure **State** at the top of the page is set to "_Enabled_" and click on "_
You can build Hydra via `nix-build` using the provided [default.nix](./default.nix):
```
$ nix build
$ nix-build
```
### Development Environment
You can use the provided shell.nix to get a working development environment:
```
$ nix develop
$ mesonConfigurePhase
$ ninja
$ nix-shell
$ autoreconfPhase
$ configurePhase # NOTE: not ./configure
$ make
```
### Executing Hydra During Development
@@ -90,9 +91,9 @@ When working on new features or bug fixes you need to be able to run Hydra from
can be done using [foreman](https://github.com/ddollar/foreman):
```
$ nix develop
$ nix-shell
$ # hack hack
$ ninja -C build
$ make
$ foreman start
```
@@ -114,24 +115,22 @@ Start by following the steps in [Development Environment](#development-environme
Then, you can run the tests and the perlcritic linter together with:
```console
$ nix develop
$ ninja -C build test
$ nix-shell
$ make check
```
You can run a single test with:
```
$ nix develop
$ cd build
$ meson test --test-args=../t/Hydra/Event.t testsuite
$ nix-shell
$ yath test ./t/foo/bar.t
```
And you can run just perlcritic with:
```
$ nix develop
$ cd build
$ meson test perlcritic
$ nix-shell
$ make perlcritic
```
### JSON API

View File

@@ -11,6 +11,12 @@ $ cd hydra
To enter a shell in which all environment variables (such as `PERL5LIB`)
and dependencies can be found:
```console
$ nix-shell
```
of when flakes are enabled:
```console
$ nix develop
```
@@ -18,15 +24,15 @@ $ nix develop
To build Hydra, you should then do:
```console
$ mesonConfigurePhase
$ ninja
[nix-shell]$ autoreconfPhase
[nix-shell]$ configurePhase
[nix-shell]$ make -j$(nproc)
```
You start a local database, the webserver, and other components with
foreman:
```console
$ ninja -C build
$ foreman start
```
@@ -41,11 +47,18 @@ $ ./src/script/hydra-server
You can run Hydra's test suite with the following:
```console
$ meson test
# to run as many tests as you have cores:
$ YATH_JOB_COUNT=$NIX_BUILD_CORES meson test
[nix-shell]$ make check
[nix-shell]$ # to run as many tests as you have cores:
[nix-shell]$ make check YATH_JOB_COUNT=$NIX_BUILD_CORES
[nix-shell]$ # or run yath directly:
[nix-shell]$ yath test
[nix-shell]$ # to run as many tests as you have cores:
[nix-shell]$ yath test -j $NIX_BUILD_CORES
```
When using `yath` instead of `make check`, ensure you have run `make`
in the root of the repository at least once.
**Warning**: Currently, the tests can fail
if run with high parallelism [due to an issue in
`Test::PostgreSQL`](https://github.com/TJC/Test-postgresql/issues/40)
@@ -62,7 +75,7 @@ will reload the page every time you save.
To build Hydra and its dependencies:
```console
$ nix build .#packages.x86_64-linux.default
$ nix-build release.nix -A build.x86_64-linux
```
## Development Tasks

95
flake.lock generated
View File

@@ -1,10 +1,51 @@
{
"nodes": {
"flake-parts": {
"inputs": {
"nixpkgs-lib": [
"nix-eval-jobs",
"nixpkgs"
]
},
"locked": {
"lastModified": 1722555600,
"narHash": "sha256-XOQkdLafnb/p9ij77byFQjDf5m5QYl9b2REiVClC+x4=",
"owner": "hercules-ci",
"repo": "flake-parts",
"rev": "8471fe90ad337a8074e957b69ca4d0089218391d",
"type": "github"
},
"original": {
"owner": "hercules-ci",
"repo": "flake-parts",
"type": "github"
}
},
"libgit2": {
"flake": false,
"locked": {
"lastModified": 1715853528,
"narHash": "sha256-J2rCxTecyLbbDdsyBWn9w7r3pbKRMkI9E7RvRgAqBdY=",
"owner": "libgit2",
"repo": "libgit2",
"rev": "36f7e21ad757a3dacc58cf7944329da6bc1d6e96",
"type": "github"
},
"original": {
"owner": "libgit2",
"ref": "v1.8.1",
"repo": "libgit2",
"type": "github"
}
},
"nix": {
"inputs": {
"flake-compat": [],
"flake-parts": [],
"git-hooks-nix": [],
"libgit2": [
"libgit2"
],
"nixpkgs": [
"nixpkgs"
],
@@ -12,58 +53,88 @@
"nixpkgs-regression": []
},
"locked": {
"lastModified": 1744030329,
"narHash": "sha256-r+psCOW77vTSTNbxTVrYHeh6OgB0QukbnyUVDwg8s4I=",
"lastModified": 1726787955,
"narHash": "sha256-XFznzb8L4SdUm9u+w3DPpMWJhffuv+/6+aiVl00slns=",
"owner": "NixOS",
"repo": "nix",
"rev": "a4962f73b5fc874d4b16baef47921daf349addfc",
"rev": "a7fdef6858dd45b9d7bda7c92324c63faee7f509",
"type": "github"
},
"original": {
"owner": "NixOS",
"ref": "2.28-maintenance",
"ref": "2.24-maintenance",
"repo": "nix",
"type": "github"
}
},
"nix-eval-jobs": {
"flake": false,
"inputs": {
"flake-parts": "flake-parts",
"nix-github-actions": [],
"nixpkgs": [
"nixpkgs"
],
"treefmt-nix": "treefmt-nix"
},
"locked": {
"lastModified": 1744018595,
"narHash": "sha256-v5n6t49X7MOpqS9j0FtI6TWOXvxuZMmGsp2OfUK5QfA=",
"lastModified": 1733814344,
"narHash": "sha256-3wwtKpS5tUBdjaGeSia7CotonbiRB6K5Kp0dsUt3nzU=",
"owner": "nix-community",
"repo": "nix-eval-jobs",
"rev": "cba718bafe5dc1607c2b6761ecf53c641a6f3b21",
"rev": "889ea1406736b53cf165b6c28398aae3969418d1",
"type": "github"
},
"original": {
"owner": "nix-community",
"ref": "release-2.24",
"repo": "nix-eval-jobs",
"type": "github"
}
},
"nixpkgs": {
"locked": {
"lastModified": 1743987495,
"narHash": "sha256-46T2vMZ4/AfCK0Y2OjlFzJPxmdpP8GtsuEqSSJv3oe4=",
"lastModified": 1726688310,
"narHash": "sha256-Xc9lEtentPCEtxc/F1e6jIZsd4MPDYv4Kugl9WtXlz0=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "db8f4fe18ce772a9c8f3adf321416981c8fe9371",
"rev": "dbebdd67a6006bb145d98c8debf9140ac7e651d0",
"type": "github"
},
"original": {
"owner": "NixOS",
"ref": "nixos-24.11-small",
"ref": "nixos-24.05-small",
"repo": "nixpkgs",
"type": "github"
}
},
"root": {
"inputs": {
"libgit2": "libgit2",
"nix": "nix",
"nix-eval-jobs": "nix-eval-jobs",
"nixpkgs": "nixpkgs"
}
},
"treefmt-nix": {
"inputs": {
"nixpkgs": [
"nix-eval-jobs",
"nixpkgs"
]
},
"locked": {
"lastModified": 1723303070,
"narHash": "sha256-krGNVA30yptyRonohQ+i9cnK+CfCpedg6z3qzqVJcTs=",
"owner": "numtide",
"repo": "treefmt-nix",
"rev": "14c092e0326de759e16b37535161b3cb9770cea3",
"type": "github"
},
"original": {
"owner": "numtide",
"repo": "treefmt-nix",
"type": "github"
}
}
},
"root": "root",

View File

@@ -1,25 +1,25 @@
{
description = "A Nix-based continuous build system";
inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11-small";
inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05-small";
inputs.nix = {
url = "github:NixOS/nix/2.28-maintenance";
inputs.nixpkgs.follows = "nixpkgs";
inputs.libgit2 = { url = "github:libgit2/libgit2/v1.8.1"; flake = false; };
inputs.nix.url = "github:NixOS/nix/2.24-maintenance";
inputs.nix.inputs.nixpkgs.follows = "nixpkgs";
inputs.nix.inputs.libgit2.follows = "libgit2";
# hide nix dev tooling from our lock file
inputs.flake-parts.follows = "";
inputs.git-hooks-nix.follows = "";
inputs.nixpkgs-regression.follows = "";
inputs.nixpkgs-23-11.follows = "";
inputs.flake-compat.follows = "";
};
inputs.nix-eval-jobs.url = "github:nix-community/nix-eval-jobs/release-2.24";
inputs.nix-eval-jobs.inputs.nixpkgs.follows = "nixpkgs";
inputs.nix-eval-jobs = {
url = "github:nix-community/nix-eval-jobs";
# We want to control the deps precisely
flake = false;
};
# hide nix dev tooling from our lock file
inputs.nix.inputs.flake-parts.follows = "";
inputs.nix.inputs.git-hooks-nix.follows = "";
inputs.nix.inputs.nixpkgs-regression.follows = "";
inputs.nix.inputs.nixpkgs-23-11.follows = "";
inputs.nix.inputs.flake-compat.follows = "";
# hide nix-eval-jobs dev tooling from our lock file
inputs.nix-eval-jobs.inputs.nix-github-actions.follows = "";
outputs = { self, nixpkgs, nix, nix-eval-jobs, ... }:
let
@@ -30,10 +30,11 @@
# A Nixpkgs overlay that provides a 'hydra' package.
overlays.default = final: prev: {
nix-eval-jobs = final.callPackage nix-eval-jobs {};
hydra = final.callPackage ./package.nix {
inherit (nixpkgs.lib) fileset;
nix-eval-jobs = nix-eval-jobs.packages.${final.system}.default;
rawSrc = self;
nix-perl-bindings = final.nixComponents.nix-perl-bindings;
};
};
@@ -72,29 +73,13 @@
validate-openapi = hydraJobs.tests.validate-openapi.${system};
});
packages = forEachSystem (system: let
nixComponents = {
inherit (nix.packages.${system})
nix-util
nix-store
nix-expr
nix-fetchers
nix-flake
nix-main
nix-cmd
nix-cli
nix-perl-bindings
;
};
in {
nix-eval-jobs = nixpkgs.legacyPackages.${system}.callPackage nix-eval-jobs {
inherit nixComponents;
};
packages = forEachSystem (system: {
hydra = nixpkgs.legacyPackages.${system}.callPackage ./package.nix {
inherit (nixpkgs.lib) fileset;
inherit nixComponents;
inherit (self.packages.${system}) nix-eval-jobs;
nix-eval-jobs = nix-eval-jobs.packages.${system}.default;
rawSrc = self;
nix = nix.packages.${system}.nix;
nix-perl-bindings = nix.hydraJobs.perlBindings.${system};
};
default = self.packages.${system}.hydra;
});

View File

@@ -8,9 +8,23 @@ project('hydra', 'cpp',
],
)
nix_util_dep = dependency('nix-util', required: true)
nix_store_dep = dependency('nix-store', required: true)
nix_main_dep = dependency('nix-main', required: true)
nix_expr_dep = dependency('nix-expr', required: true)
nix_flake_dep = dependency('nix-flake', required: true)
nix_cmd_dep = dependency('nix-cmd', required: true)
# Nix need extra flags not provided in its pkg-config files.
nix_dep = declare_dependency(
dependencies: [
nix_store_dep,
nix_main_dep,
nix_expr_dep,
nix_flake_dep,
nix_cmd_dep,
],
compile_args: ['-include', 'nix/config.h'],
)
pqxx_dep = dependency('libpqxx', required: true)

View File

@@ -15,6 +15,7 @@
systemd.services.hydra-send-stats.enable = false;
services.postgresql.enable = true;
services.postgresql.package = pkgs.postgresql_12;
# The following is to work around the following error from hydra-server:
# [error] Caught exception in engine "Cannot determine local time zone"

View File

@@ -468,7 +468,7 @@ in
elif [[ $compression == zstd ]]; then
compression="zstd --rm"
fi
find ${baseDir}/build-logs -ignore_readdir_race -type f -name "*.drv" -mtime +3 -size +0c | xargs -r "$compression" --force --quiet
find ${baseDir}/build-logs -type f -name "*.drv" -mtime +3 -size +0c | xargs -r "$compression" --force --quiet
'';
startAt = "Sun 01:45";
};

View File

@@ -27,7 +27,8 @@ in
{
install = forEachSystem (system:
(import (nixpkgs + "/nixos/lib/testing-python.nix") { inherit system; }).simpleTest {
with import (nixpkgs + "/nixos/lib/testing-python.nix") { inherit system; };
simpleTest {
name = "hydra-install";
nodes.machine = hydraServer;
testScript =
@@ -42,7 +43,8 @@ in
});
notifications = forEachSystem (system:
(import (nixpkgs + "/nixos/lib/testing-python.nix") { inherit system; }).simpleTest {
with import (nixpkgs + "/nixos/lib/testing-python.nix") { inherit system; };
simpleTest {
name = "hydra-notifications";
nodes.machine = {
imports = [ hydraServer ];
@@ -54,7 +56,7 @@ in
'';
services.influxdb.enable = true;
};
testScript = { nodes, ... }: ''
testScript = ''
machine.wait_for_job("hydra-init")
# Create an admin account and some other state.
@@ -85,7 +87,7 @@ in
# Setup the project and jobset
machine.succeed(
"su - hydra -c 'perl -I ${nodes.machine.services.hydra-dev.package.perlDeps}/lib/perl5/site_perl ${./t/setup-notifications-jobset.pl}' >&2"
"su - hydra -c 'perl -I ${config.services.hydra-dev.package.perlDeps}/lib/perl5/site_perl ${./t/setup-notifications-jobset.pl}' >&2"
)
# Wait until hydra has build the job and
@@ -99,10 +101,9 @@ in
});
gitea = forEachSystem (system:
let
pkgs = nixpkgs.legacyPackages.${system};
in
(import (nixpkgs + "/nixos/lib/testing-python.nix") { inherit system; }).makeTest {
let pkgs = nixpkgs.legacyPackages.${system}; in
with import (nixpkgs + "/nixos/lib/testing-python.nix") { inherit system; };
makeTest {
name = "hydra-gitea";
nodes.machine = { pkgs, ... }: {
imports = [ hydraServer ];

View File

@@ -8,7 +8,8 @@
, perlPackages
, nixComponents
, nix
, nix-perl-bindings
, git
, makeWrapper
@@ -61,7 +62,7 @@ let
name = "hydra-perl-deps";
paths = lib.closePropagation
([
nixComponents.nix-perl-bindings
nix-perl-bindings
git
] ++ (with perlPackages; [
AuthenSASL
@@ -162,7 +163,7 @@ stdenv.mkDerivation (finalAttrs: {
nukeReferences
pkg-config
mdbook
nixComponents.nix-cli
nix
perlDeps
perl
unzip
@@ -172,9 +173,7 @@ stdenv.mkDerivation (finalAttrs: {
libpqxx
openssl
libxslt
nixComponents.nix-util
nixComponents.nix-store
nixComponents.nix-main
nix
perlDeps
perl
boost
@@ -201,14 +200,13 @@ stdenv.mkDerivation (finalAttrs: {
glibcLocales
libressl.nc
python3
nixComponents.nix-cli
];
hydraPath = lib.makeBinPath (
[
subversion
openssh
nixComponents.nix-cli
nix
coreutils
findutils
pixz
@@ -238,7 +236,7 @@ stdenv.mkDerivation (finalAttrs: {
shellHook = ''
pushd $(git rev-parse --show-toplevel) >/dev/null
PATH=$(pwd)/build/src/hydra-evaluator:$(pwd)/build/src/script:$(pwd)/build/src/hydra-queue-runner:$PATH
PATH=$(pwd)/src/hydra-evaluator:$(pwd)/src/script:$(pwd)/src/hydra-queue-runner:$PATH
PERL5LIB=$(pwd)/src/lib:$PERL5LIB
export HYDRA_HOME="$(pwd)/src/"
mkdir -p .hydra-data
@@ -269,7 +267,7 @@ stdenv.mkDerivation (finalAttrs: {
--prefix PATH ':' $out/bin:$hydraPath \
--set HYDRA_RELEASE ${version} \
--set HYDRA_HOME $out/libexec/hydra \
--set NIX_RELEASE ${nixComponents.nix-cli.name or "unknown"} \
--set NIX_RELEASE ${nix.name or "unknown"} \
--set NIX_EVAL_JOBS_RELEASE ${nix-eval-jobs.name or "unknown"}
done
'';
@@ -277,5 +275,5 @@ stdenv.mkDerivation (finalAttrs: {
dontStrip = true;
meta.description = "Build of Hydra on ${stdenv.system}";
passthru = { inherit perlDeps; };
passthru = { inherit perlDeps nix; };
})

View File

@@ -1,8 +1,8 @@
#include "db.hh"
#include "hydra-config.hh"
#include <nix/util/pool.hh>
#include <nix/main/shared.hh>
#include <nix/util/signals.hh>
#include "pool.hh"
#include "shared.hh"
#include "signals.hh"
#include <algorithm>
#include <thread>

View File

@@ -2,8 +2,7 @@ hydra_evaluator = executable('hydra-evaluator',
'hydra-evaluator.cc',
dependencies: [
libhydra_dep,
nix_util_dep,
nix_main_dep,
nix_dep,
pqxx_dep,
],
install: true,

View File

@@ -5,40 +5,44 @@
#include <sys/stat.h>
#include <fcntl.h>
#include <nix/store/build-result.hh>
#include <nix/store/path.hh>
#include <nix/store/legacy-ssh-store.hh>
#include <nix/store/serve-protocol.hh>
#include <nix/store/serve-protocol-impl.hh>
#include "build-result.hh"
#include "path.hh"
#include "serve-protocol.hh"
#include "serve-protocol-impl.hh"
#include "state.hh"
#include <nix/util/current-process.hh>
#include <nix/util/processes.hh>
#include <nix/util/util.hh>
#include <nix/store/serve-protocol.hh>
#include <nix/store/serve-protocol-impl.hh>
#include <nix/store/ssh.hh>
#include <nix/util/finally.hh>
#include <nix/util/url.hh>
#include "current-process.hh"
#include "processes.hh"
#include "util.hh"
#include "serve-protocol.hh"
#include "serve-protocol-impl.hh"
#include "ssh.hh"
#include "finally.hh"
#include "url.hh"
using namespace nix;
bool ::Machine::isLocalhost() const
{
return storeUri.params.empty() && std::visit(overloaded {
[](const StoreReference::Auto &) {
return true;
},
[](const StoreReference::Specified & s) {
return
(s.scheme == "local" || s.scheme == "unix") ||
((s.scheme == "ssh" || s.scheme == "ssh-ng") &&
s.authority == "localhost");
},
}, storeUri.variant);
}
namespace nix::build_remote {
static Strings extraStoreArgs(std::string & machine)
{
Strings result;
try {
auto parsed = parseURL(machine);
if (parsed.scheme != "ssh") {
throw SysError("Currently, only (legacy-)ssh stores are supported!");
}
machine = parsed.authority.value_or("");
auto remoteStore = parsed.query.find("remote-store");
if (remoteStore != parsed.query.end()) {
result = {"--store", shellEscape(remoteStore->second)};
}
} catch (BadURL &) {
// We just try to continue with `machine->sshName` here for backwards compat.
}
return result;
}
static std::unique_ptr<SSHMaster::Connection> openConnection(
::Machine::ptr machine, SSHMaster & master)
{
@@ -47,11 +51,7 @@ static std::unique_ptr<SSHMaster::Connection> openConnection(
command.push_back("--builders");
command.push_back("");
} else {
auto remoteStore = machine->storeUri.params.find("remote-store");
if (remoteStore != machine->storeUri.params.end()) {
command.push_back("--store");
command.push_back(shellEscape(remoteStore->second));
}
command.splice(command.end(), extraStoreArgs(machine->sshName));
}
auto ret = master.startCommand(std::move(command), {
@@ -198,7 +198,7 @@ static BasicDerivation sendInputs(
MaintainCount<counter> mc2(nrStepsCopyingTo);
printMsg(lvlDebug, "sending closure of %s to %s",
localStore.printStorePath(step.drvPath), conn.machine->storeUri.render());
localStore.printStorePath(step.drvPath), conn.machine->sshName);
auto now1 = std::chrono::steady_clock::now();
@@ -278,6 +278,32 @@ static BuildResult performBuild(
return result;
}
static std::map<StorePath, UnkeyedValidPathInfo> queryPathInfos(
::Machine::Connection & conn,
Store & localStore,
StorePathSet & outputs,
size_t & totalNarSize
)
{
/* Get info about each output path. */
std::map<StorePath, UnkeyedValidPathInfo> infos;
conn.to << ServeProto::Command::QueryPathInfos;
ServeProto::write(localStore, conn, outputs);
conn.to.flush();
while (true) {
auto storePathS = readString(conn.from);
if (storePathS == "") break;
auto storePath = localStore.parseStorePath(storePathS);
auto info = ServeProto::Serialise<UnkeyedValidPathInfo>::read(localStore, conn);
totalNarSize += info.narSize;
infos.insert_or_assign(std::move(storePath), std::move(info));
}
return infos;
}
static void copyPathFromRemote(
::Machine::Connection & conn,
NarMemberDatas & narMembers,
@@ -398,7 +424,7 @@ private:
};
void State::buildRemote(ref<Store> destStore,
std::unique_ptr<MachineReservation> reservation,
MachineReservation::ptr & reservation,
::Machine::ptr machine, Step::ptr step,
const ServeProto::BuildOptions & buildOptions,
RemoteResult & result, std::shared_ptr<ActiveStep> activeStep,
@@ -415,22 +441,14 @@ void State::buildRemote(ref<Store> destStore,
updateStep(ssConnecting);
auto storeRef = machine->completeStoreReference();
auto * pSpecified = std::get_if<StoreReference::Specified>(&storeRef.variant);
if (!pSpecified || pSpecified->scheme != "ssh") {
throw Error("Currently, only (legacy-)ssh stores are supported!");
}
LegacySSHStoreConfig storeConfig {
pSpecified->scheme,
pSpecified->authority,
storeRef.params
};
auto master = storeConfig.createSSHMaster(
SSHMaster master {
machine->sshName,
machine->sshKey,
machine->sshPublicHostKey,
false, // no SSH master yet
logFD.get());
false, // no compression yet
logFD.get(),
};
// FIXME: rewrite to use Store.
auto child = build_remote::openConnection(machine, master);
@@ -475,13 +493,19 @@ void State::buildRemote(ref<Store> destStore,
conn.to,
conn.from,
our_version,
machine->storeUri.render());
machine->sshName);
} catch (EndOfFile & e) {
child->sshPid.wait();
std::string s = chomp(readFile(result.logFile));
throw Error("cannot connect to %1%: %2%", machine->storeUri.render(), s);
throw Error("cannot connect to %1%: %2%", machine->sshName, s);
}
// Do not attempt to speak a newer version of the protocol.
//
// Per https://github.com/NixOS/nix/issues/9584 should be handled as
// part of `handshake` in upstream nix.
conn.remoteVersion = std::min(conn.remoteVersion, our_version);
{
auto info(machine->state->connectInfo.lock());
info->consecutiveFailures = 0;
@@ -510,7 +534,7 @@ void State::buildRemote(ref<Store> destStore,
/* Do the build. */
printMsg(lvlDebug, "building %s on %s",
localStore->printStorePath(step->drvPath),
machine->storeUri.render());
machine->sshName);
updateStep(ssBuilding);
@@ -533,7 +557,7 @@ void State::buildRemote(ref<Store> destStore,
get a build log. */
if (result.isCached) {
printMsg(lvlInfo, "outputs of %s substituted or already valid on %s",
localStore->printStorePath(step->drvPath), machine->storeUri.render());
localStore->printStorePath(step->drvPath), machine->sshName);
unlink(result.logFile.c_str());
result.logFile = "";
}
@@ -552,7 +576,8 @@ void State::buildRemote(ref<Store> destStore,
* to avoid situations where the queue runner is bottlenecked on
* copying outputs and we end up building too many things that we
* haven't been able to allow copy slots for. */
reservation.reset();
assert(reservation.unique());
reservation = 0;
wakeDispatcher();
StorePathSet outputs;
@@ -567,10 +592,8 @@ void State::buildRemote(ref<Store> destStore,
auto now1 = std::chrono::steady_clock::now();
auto infos = conn.queryPathInfos(*localStore, outputs);
size_t totalNarSize = 0;
for (auto & [_, info] : infos) totalNarSize += info.narSize;
auto infos = build_remote::queryPathInfos(conn, *localStore, outputs, totalNarSize);
if (totalNarSize > maxOutputSize) {
result.stepStatus = bsNarSizeLimitExceeded;
@@ -579,7 +602,7 @@ void State::buildRemote(ref<Store> destStore,
/* Copy each path. */
printMsg(lvlDebug, "copying outputs of %s from %s (%d bytes)",
localStore->printStorePath(step->drvPath), machine->storeUri.render(), totalNarSize);
localStore->printStorePath(step->drvPath), machine->sshName, totalNarSize);
build_remote::copyPathsFromRemote(conn, narMembers, *localStore, *destStore, infos);
auto now2 = std::chrono::steady_clock::now();
@@ -618,7 +641,7 @@ void State::buildRemote(ref<Store> destStore,
info->consecutiveFailures = std::min(info->consecutiveFailures + 1, (unsigned int) 4);
info->lastFailure = now;
int delta = retryInterval * std::pow(retryBackoff, info->consecutiveFailures - 1) + (rand() % 30);
printMsg(lvlInfo, "will disable machine %1% for %2%s", machine->storeUri.render(), delta);
printMsg(lvlInfo, "will disable machine %1% for %2%s", machine->sshName, delta);
info->disabledUntil = now + std::chrono::seconds(delta);
}
throw;

View File

@@ -1,7 +1,7 @@
#include "hydra-build-result.hh"
#include <nix/store/store-api.hh>
#include <nix/util/util.hh>
#include <nix/util/source-accessor.hh>
#include "store-api.hh"
#include "util.hh"
#include "source-accessor.hh"
#include <regex>

View File

@@ -2,8 +2,8 @@
#include "state.hh"
#include "hydra-build-result.hh"
#include <nix/util/finally.hh>
#include <nix/store/binary-cache-store.hh>
#include "finally.hh"
#include "binary-cache-store.hh"
using namespace nix;
@@ -16,7 +16,7 @@ void setThreadName(const std::string & name)
}
void State::builder(std::unique_ptr<MachineReservation> reservation)
void State::builder(MachineReservation::ptr reservation)
{
setThreadName("bld~" + std::string(reservation->step->drvPath.to_string()));
@@ -35,20 +35,25 @@ void State::builder(std::unique_ptr<MachineReservation> reservation)
activeSteps_.lock()->erase(activeStep);
});
std::string machine = reservation->machine->storeUri.render();
try {
auto destStore = getDestStore();
// Might release the reservation.
res = doBuildStep(destStore, std::move(reservation), activeStep);
res = doBuildStep(destStore, reservation, activeStep);
} catch (std::exception & e) {
printMsg(lvlError, "uncaught exception building %s on %s: %s",
localStore->printStorePath(activeStep->step->drvPath),
machine,
reservation ? reservation->machine->sshName : std::string("(no machine)"),
e.what());
}
}
/* If the machine hasn't been released yet, release and wake up the dispatcher. */
if (reservation) {
assert(reservation.unique());
reservation = 0;
wakeDispatcher();
}
/* If there was a temporary failure, retry the step after an
exponentially increasing interval. */
Step::ptr step = wstep.lock();
@@ -70,7 +75,7 @@ void State::builder(std::unique_ptr<MachineReservation> reservation)
State::StepResult State::doBuildStep(nix::ref<Store> destStore,
std::unique_ptr<MachineReservation> reservation,
MachineReservation::ptr & reservation,
std::shared_ptr<ActiveStep> activeStep)
{
auto step(reservation->step);
@@ -148,7 +153,7 @@ State::StepResult State::doBuildStep(nix::ref<Store> destStore,
buildOptions.buildTimeout = build->buildTimeout;
printInfo("performing step %s %d times on %s (needed by build %d and %d others)",
localStore->printStorePath(step->drvPath), buildOptions.nrRepeats + 1, machine->storeUri.render(), buildId, (dependents.size() - 1));
localStore->printStorePath(step->drvPath), buildOptions.nrRepeats + 1, machine->sshName, buildId, (dependents.size() - 1));
}
if (!buildOneDone)
@@ -176,7 +181,7 @@ State::StepResult State::doBuildStep(nix::ref<Store> destStore,
unlink(result.logFile.c_str());
}
} catch (...) {
ignoreExceptionInDestructor();
ignoreException();
}
}
});
@@ -194,7 +199,7 @@ State::StepResult State::doBuildStep(nix::ref<Store> destStore,
{
auto mc = startDbUpdate();
pqxx::work txn(*conn);
stepNr = createBuildStep(txn, result.startTime, buildId, step, machine->storeUri.render(), bsBusy);
stepNr = createBuildStep(txn, result.startTime, buildId, step, machine->sshName, bsBusy);
txn.commit();
}
@@ -209,7 +214,7 @@ State::StepResult State::doBuildStep(nix::ref<Store> destStore,
try {
/* FIXME: referring builds may have conflicting timeouts. */
buildRemote(destStore, std::move(reservation), machine, step, buildOptions, result, activeStep, updateStep, narMembers);
buildRemote(destStore, reservation, machine, step, buildOptions, result, activeStep, updateStep, narMembers);
} catch (Error & e) {
if (activeStep->state_.lock()->cancelled) {
printInfo("marking step %d of build %d as cancelled", stepNr, buildId);
@@ -251,7 +256,7 @@ State::StepResult State::doBuildStep(nix::ref<Store> destStore,
/* Finish the step in the database. */
if (stepNr) {
pqxx::work txn(*conn);
finishBuildStep(txn, result, buildId, stepNr, machine->storeUri.render());
finishBuildStep(txn, result, buildId, stepNr, machine->sshName);
txn.commit();
}
@@ -259,7 +264,7 @@ State::StepResult State::doBuildStep(nix::ref<Store> destStore,
issue). Retry a number of times. */
if (result.canRetry) {
printMsg(lvlError, "possibly transient failure building %s on %s: %s",
localStore->printStorePath(step->drvPath), machine->storeUri.render(), result.errorMsg);
localStore->printStorePath(step->drvPath), machine->sshName, result.errorMsg);
assert(stepNr);
bool retry;
{
@@ -450,7 +455,7 @@ void State::failStep(
build->finishedInDB)
continue;
createBuildStep(txn,
0, build->id, step, machine ? machine->storeUri.render() : "",
0, build->id, step, machine ? machine->sshName : "",
result.stepStatus, result.errorMsg, buildId == build->id ? 0 : buildId);
}

View File

@@ -262,7 +262,7 @@ system_time State::doDispatch()
/* Can this machine do this step? */
if (!mi.machine->supportsStep(step)) {
debug("machine '%s' does not support step '%s' (system type '%s')",
mi.machine->storeUri.render(), localStore->printStorePath(step->drvPath), step->drv->platform);
mi.machine->sshName, localStore->printStorePath(step->drvPath), step->drv->platform);
continue;
}
@@ -288,7 +288,7 @@ system_time State::doDispatch()
/* Make a slot reservation and start a thread to
do the build. */
auto builderThread = std::thread(&State::builder, this,
std::make_unique<MachineReservation>(*this, step, mi.machine));
std::make_shared<MachineReservation>(*this, step, mi.machine));
builderThread.detach(); // FIXME?
keepGoing = true;

View File

@@ -2,9 +2,9 @@
#include <memory>
#include <nix/util/hash.hh>
#include <nix/store/derivations.hh>
#include <nix/store/store-api.hh>
#include "hash.hh"
#include "derivations.hh"
#include "store-api.hh"
#include "nar-extractor.hh"
struct BuildProduct

View File

@@ -11,16 +11,16 @@
#include <nlohmann/json.hpp>
#include <nix/util/signals.hh>
#include "signals.hh"
#include "state.hh"
#include "hydra-build-result.hh"
#include <nix/store/store-api.hh>
#include <nix/store/remote-store.hh>
#include "store-api.hh"
#include "remote-store.hh"
#include <nix/store/globals.hh>
#include "globals.hh"
#include "hydra-config.hh"
#include <nix/store/s3-binary-cache-store.hh>
#include <nix/main/shared.hh>
#include "s3-binary-cache-store.hh"
#include "shared.hh"
using namespace nix;
using nlohmann::json;
@@ -98,34 +98,6 @@ State::PromMetrics::PromMetrics()
.Register(*registry)
.Add({})
)
, dispatcher_time_spent_running(
prometheus::BuildCounter()
.Name("hydraqueuerunner_dispatcher_time_spent_running")
.Help("Time (in micros) spent running the dispatcher")
.Register(*registry)
.Add({})
)
, dispatcher_time_spent_waiting(
prometheus::BuildCounter()
.Name("hydraqueuerunner_dispatcher_time_spent_waiting")
.Help("Time (in micros) spent waiting for the dispatcher to obtain work")
.Register(*registry)
.Add({})
)
, queue_monitor_time_spent_running(
prometheus::BuildCounter()
.Name("hydraqueuerunner_queue_monitor_time_spent_running")
.Help("Time (in micros) spent running the queue monitor")
.Register(*registry)
.Add({})
)
, queue_monitor_time_spent_waiting(
prometheus::BuildCounter()
.Name("hydraqueuerunner_queue_monitor_time_spent_waiting")
.Help("Time (in micros) spent waiting for the queue monitor to obtain work")
.Register(*registry)
.Add({})
)
{
}
@@ -185,26 +157,65 @@ void State::parseMachines(const std::string & contents)
oldMachines = *machines_;
}
for (auto && machine_ : nix::Machine::parseConfig({}, contents)) {
auto machine = std::make_shared<::Machine>(std::move(machine_));
for (auto line : tokenizeString<Strings>(contents, "\n")) {
line = trim(std::string(line, 0, line.find('#')));
auto tokens = tokenizeString<std::vector<std::string>>(line);
if (tokens.size() < 3) continue;
tokens.resize(8);
if (tokens[5] == "-") tokens[5] = "";
auto supportedFeatures = tokenizeString<StringSet>(tokens[5], ",");
if (tokens[6] == "-") tokens[6] = "";
auto mandatoryFeatures = tokenizeString<StringSet>(tokens[6], ",");
for (auto & f : mandatoryFeatures)
supportedFeatures.insert(f);
using MaxJobs = std::remove_const<decltype(nix::Machine::maxJobs)>::type;
auto machine = std::make_shared<::Machine>(nix::Machine {
// `storeUri`, not yet used
"",
// `systemTypes`
tokenizeString<StringSet>(tokens[1], ","),
// `sshKey`
tokens[2] == "-" ? "" : tokens[2],
// `maxJobs`
tokens[3] != ""
? string2Int<MaxJobs>(tokens[3]).value()
: 1,
// `speedFactor`
std::stof(tokens[4].c_str()),
// `supportedFeatures`
std::move(supportedFeatures),
// `mandatoryFeatures`
std::move(mandatoryFeatures),
// `sshPublicHostKey`
tokens[7] != "" && tokens[7] != "-"
? tokens[7]
: "",
});
machine->sshName = tokens[0];
/* Re-use the State object of the previous machine with the
same name. */
auto i = oldMachines.find(machine->storeUri.variant);
auto i = oldMachines.find(machine->sshName);
if (i == oldMachines.end())
printMsg(lvlChatty, "adding new machine %1%", machine->storeUri.render());
printMsg(lvlChatty, "adding new machine %1%", machine->sshName);
else
printMsg(lvlChatty, "updating machine %1%", machine->storeUri.render());
printMsg(lvlChatty, "updating machine %1%", machine->sshName);
machine->state = i == oldMachines.end()
? std::make_shared<::Machine::State>()
: i->second->state;
newMachines[machine->storeUri.variant] = machine;
newMachines[machine->sshName] = machine;
}
for (auto & m : oldMachines)
if (newMachines.find(m.first) == newMachines.end()) {
if (m.second->enabled)
printInfo("removing machine %1%", m.second->storeUri.render());
printInfo("removing machine %1%", m.first);
/* Add a disabled ::Machine object to make sure stats are
maintained. */
auto machine = std::make_shared<::Machine>(*(m.second));
@@ -643,7 +654,6 @@ void State::dumpStatus(Connection & conn)
}
{
auto machines_json = json::object();
auto machines_(machines.lock());
for (auto & i : *machines_) {
auto & m(i.second);
@@ -670,9 +680,8 @@ void State::dumpStatus(Connection & conn)
machine["avgStepTime"] = (float) s->totalStepTime / s->nrStepsDone;
machine["avgStepBuildTime"] = (float) s->totalStepBuildTime / s->nrStepsDone;
}
machines_json[m->storeUri.render()] = machine;
statusJson["machines"][m->sshName] = machine;
}
statusJson["machines"] = machines_json;
}
{
@@ -731,7 +740,6 @@ void State::dumpStatus(Connection & conn)
: 0.0},
};
#if NIX_WITH_S3_SUPPORT
auto s3Store = dynamic_cast<S3BinaryCacheStore *>(&*store);
if (s3Store) {
auto & s3Stats = s3Store->getS3Stats();
@@ -757,7 +765,6 @@ void State::dumpStatus(Connection & conn)
+ s3Stats.getBytes / (1024.0 * 1024.0 * 1024.0) * 0.09},
};
}
#endif
}
{

View File

@@ -13,9 +13,7 @@ hydra_queue_runner = executable('hydra-queue-runner',
srcs,
dependencies: [
libhydra_dep,
nix_util_dep,
nix_store_dep,
nix_main_dep,
nix_dep,
pqxx_dep,
prom_cpp_core_dep,
prom_cpp_pull_dep,

View File

@@ -1,6 +1,6 @@
#include "nar-extractor.hh"
#include <nix/util/archive.hh>
#include "archive.hh"
#include <unordered_set>

View File

@@ -1,9 +1,9 @@
#pragma once
#include <nix/util/source-accessor.hh>
#include <nix/util/types.hh>
#include <nix/util/serialise.hh>
#include <nix/util/hash.hh>
#include "source-accessor.hh"
#include "types.hh"
#include "serialise.hh"
#include "hash.hh"
struct NarMemberData
{

View File

@@ -1,8 +1,7 @@
#include "state.hh"
#include "hydra-build-result.hh"
#include <nix/store/globals.hh>
#include <nix/store/parsed-derivations.hh>
#include <nix/util/thread-pool.hh>
#include "globals.hh"
#include "thread-pool.hh"
#include <cstring>
@@ -491,17 +490,14 @@ Step::ptr State::createStep(ref<Store> destStore,
it's not runnable yet, and other threads won't make it
runnable while step->created == false. */
step->drv = std::make_unique<Derivation>(localStore->readDerivation(drvPath));
{
auto parsedDrv = ParsedDerivation{drvPath, *step->drv};
step->drvOptions = std::make_unique<DerivationOptions>(DerivationOptions::fromParsedDerivation(parsedDrv));
}
step->parsedDrv = std::make_unique<ParsedDerivation>(drvPath, *step->drv);
step->preferLocalBuild = step->drvOptions->willBuildLocally(*localStore, *step->drv);
step->preferLocalBuild = step->parsedDrv->willBuildLocally(*localStore);
step->isDeterministic = getOr(step->drv->env, "isDetermistic", "0") == "1";
step->systemType = step->drv->platform;
{
StringSet features = step->requiredSystemFeatures = step->drvOptions->getRequiredSystemFeatures(*step->drv);
StringSet features = step->requiredSystemFeatures = step->parsedDrv->getRequiredSystemFeatures();
if (step->preferLocalBuild)
features.insert("local");
if (!features.empty()) {

View File

@@ -15,18 +15,17 @@
#include "db.hh"
#include <nix/store/derivations.hh>
#include <nix/store/derivation-options.hh>
#include <nix/store/pathlocks.hh>
#include <nix/util/pool.hh>
#include <nix/store/build-result.hh>
#include <nix/store/store-api.hh>
#include <nix/util/sync.hh>
#include "parsed-derivations.hh"
#include "pathlocks.hh"
#include "pool.hh"
#include "build-result.hh"
#include "store-api.hh"
#include "sync.hh"
#include "nar-extractor.hh"
#include <nix/store/serve-protocol.hh>
#include <nix/store/serve-protocol-impl.hh>
#include <nix/store/serve-protocol-connection.hh>
#include <nix/store/machines.hh>
#include "serve-protocol.hh"
#include "serve-protocol-impl.hh"
#include "serve-protocol-connection.hh"
#include "machines.hh"
typedef unsigned int BuildID;
@@ -171,7 +170,7 @@ struct Step
nix::StorePath drvPath;
std::unique_ptr<nix::Derivation> drv;
std::unique_ptr<nix::DerivationOptions> drvOptions;
std::unique_ptr<nix::ParsedDerivation> parsedDrv;
std::set<std::string> requiredSystemFeatures;
bool preferLocalBuild;
bool isDeterministic;
@@ -244,6 +243,10 @@ struct Machine : nix::Machine
{
typedef std::shared_ptr<Machine> ptr;
/* TODO Get rid of: `nix::Machine::storeUri` is normalized in a way
we are not yet used to, but once we are, we don't need this. */
std::string sshName;
struct State {
typedef std::shared_ptr<State> ptr;
counter currentJobs{0};
@@ -293,7 +296,11 @@ struct Machine : nix::Machine
return true;
}
bool isLocalhost() const;
bool isLocalhost()
{
std::regex r("^(ssh://|ssh-ng://)?localhost$");
return std::regex_search(sshName, r);
}
// A connection to a machine
struct Connection : nix::ServeProto::BasicClientConnection {
@@ -353,7 +360,7 @@ private:
/* The build machines. */
std::mutex machinesReadyLock;
typedef std::map<nix::StoreReference::Variant, Machine::ptr> Machines;
typedef std::map<std::string, Machine::ptr> Machines;
nix::Sync<Machines> machines; // FIXME: use atomic_shared_ptr
/* Throttler for CPU-bound local work. */
@@ -400,6 +407,7 @@ private:
struct MachineReservation
{
typedef std::shared_ptr<MachineReservation> ptr;
State & state;
Step::ptr step;
Machine::ptr machine;
@@ -464,12 +472,6 @@ private:
prometheus::Counter& queue_monitor_time_spent_running;
prometheus::Counter& queue_monitor_time_spent_waiting;
prometheus::Counter& dispatcher_time_spent_running;
prometheus::Counter& dispatcher_time_spent_waiting;
prometheus::Counter& queue_monitor_time_spent_running;
prometheus::Counter& queue_monitor_time_spent_waiting;
PromMetrics();
};
PromMetrics prom;
@@ -555,17 +557,17 @@ private:
void abortUnsupported();
void builder(std::unique_ptr<MachineReservation> reservation);
void builder(MachineReservation::ptr reservation);
/* Perform the given build step. Return true if the step is to be
retried. */
enum StepResult { sDone, sRetry, sMaybeCancelled };
StepResult doBuildStep(nix::ref<nix::Store> destStore,
std::unique_ptr<MachineReservation> reservation,
MachineReservation::ptr & reservation,
std::shared_ptr<ActiveStep> activeStep);
void buildRemote(nix::ref<nix::Store> destStore,
std::unique_ptr<MachineReservation> reservation,
MachineReservation::ptr & reservation,
Machine::ptr machine, Step::ptr step,
const nix::ServeProto::BuildOptions & buildOptions,
RemoteResult & result, std::shared_ptr<ActiveStep> activeStep,

View File

@@ -6,7 +6,6 @@ use base 'Catalyst::View::TT';
use Template::Plugin::HTML;
use Hydra::Helper::Nix;
use Time::Seconds;
use Digest::SHA qw(sha1_hex);
__PACKAGE__->config(
TEMPLATE_EXTENSION => '.tt',
@@ -26,14 +25,8 @@ __PACKAGE__->config(
makeNameTextForJobset
relativeDuration
stripSSHUser
metricDivId
/]);
sub metricDivId {
my ($self, $c, $text) = @_;
return "metric-" . sha1_hex($text);
}
sub buildLogExists {
my ($self, $c, $build) = @_;
return 1 if defined $c->config->{log_prefix};

View File

@@ -2,8 +2,8 @@
#include <pqxx/pqxx>
#include <nix/util/environment-variables.hh>
#include <nix/util/util.hh>
#include "environment-variables.hh"
#include "util.hh"
struct Connection : pqxx::connection

View File

@@ -2,8 +2,8 @@
#include <map>
#include <nix/util/file-system.hh>
#include <nix/util/util.hh>
#include "file-system.hh"
#include "util.hh"
struct HydraConfig
{

View File

@@ -13,12 +13,12 @@
<div class="tab-content tab-pane">
<div id="tabs-errors" class="">
[% IF eval %]
<p>Errors occurred at [% INCLUDE renderDateTime timestamp=(eval.evaluationerror.errortime || eval.timestamp) %].</p>
<div class="card bg-light"><div class="card-body"><pre>[% HTML.escape(eval.evaluationerror.errormsg) %]</pre></div></div>
[% ELSIF jobset %]
[% IF jobset %]
<p>Errors occurred at [% INCLUDE renderDateTime timestamp=(jobset.errortime || jobset.lastcheckedtime) %].</p>
<div class="card bg-light"><div class="card-body"><pre>[% HTML.escape(jobset.fetcherrormsg || jobset.errormsg) %]</pre></div></div>
[% ELSIF eval %]
<p>Errors occurred at [% INCLUDE renderDateTime timestamp=(eval.evaluationerror.errortime || eval.timestamp) %].</p>
<div class="card bg-light"><div class="card-body"><pre>[% HTML.escape(eval.evaluationerror.errormsg) %]</pre></div></div>
[% END %]
</div>
</div>

View File

@@ -18,7 +18,8 @@
<h3>Metric: <a [% HTML.attributes(href => c.uri_for('/job' project.name jobset.name job 'metric' metric.name)) %]><tt>[%HTML.escape(metric.name)%]</tt></a></h3>
[% id = metricDivId(metric.name);
[% id = "metric-" _ metric.name;
id = id.replace('\.', '_');
INCLUDE createChart dataUrl=c.uri_for('/job' project.name jobset.name job 'metric' metric.name); %]
[% END %]

View File

@@ -773,9 +773,6 @@ sub checkJobsetWrapped {
my $jobsetChanged = 0;
my %buildMap;
my @jobs;
push @jobs, $_ while defined($_ = $jobsIter->());
$db->txn_do(sub {
my $prevEval = getPrevJobsetEval($db, $jobset, 1);
@@ -799,7 +796,7 @@ sub checkJobsetWrapped {
my @jobsWithConstituents;
foreach my $job (@jobs) {
while (defined(my $job = $jobsIter->())) {
if ($jobsetsJobset) {
die "The .jobsets jobset must only have a single job named 'jobsets'"
unless $job->{attr} eq "jobsets";

View File

@@ -6,55 +6,27 @@ use Hydra::Helper::Exec;
my $ctx = test_context();
subtest "broken constituents expression" => sub {
my $jobsetCtx = $ctx->makeJobset(
expression => 'constituents-broken.nix',
);
my $jobset = $jobsetCtx->{"jobset"};
my $jobsetCtx = $ctx->makeJobset(
expression => 'constituents-broken.nix',
);
my $jobset = $jobsetCtx->{"jobset"};
my ($res, $stdout, $stderr) = captureStdoutStderr(60,
("hydra-eval-jobset", $jobsetCtx->{"project"}->name, $jobset->name)
);
isnt($res, 0, "hydra-eval-jobset exits non-zero");
ok(utf8::decode($stderr), "Stderr output is UTF8-clean");
like(
$stderr,
qr/aggregate job 'mixed_aggregate' references non-existent job 'constituentA'/,
"The stderr record includes a relevant error message"
);
my ($res, $stdout, $stderr) = captureStdoutStderr(60,
("hydra-eval-jobset", $jobsetCtx->{"project"}->name, $jobset->name)
);
isnt($res, 0, "hydra-eval-jobset exits non-zero");
ok(utf8::decode($stderr), "Stderr output is UTF8-clean");
like(
$stderr,
qr/aggregate job mixed_aggregate failed with the error: "constituentA": does not exist/,
"The stderr record includes a relevant error message"
);
$jobset->discard_changes({ '+columns' => {'errormsg' => 'errormsg'} }); # refresh from DB
like(
$jobset->errormsg,
qr/aggregate job mixed_aggregate failed with the error: constituentA: does not exist/,
"The jobset records a relevant error message"
);
};
subtest "no matches" => sub {
my $jobsetCtx = $ctx->makeJobset(
expression => 'constituents-no-matches.nix',
);
my $jobset = $jobsetCtx->{"jobset"};
my ($res, $stdout, $stderr) = captureStdoutStderr(60,
("hydra-eval-jobset", $jobsetCtx->{"project"}->name, $jobset->name)
);
isnt($res, 0, "hydra-eval-jobset exits non-zero");
ok(utf8::decode($stderr), "Stderr output is UTF8-clean");
like(
$stderr,
qr/aggregate job 'non_match_aggregate' references constituent glob pattern 'tests\.\*' with no matches/,
"The stderr record includes a relevant error message"
);
$jobset->discard_changes({ '+columns' => {'errormsg' => 'errormsg'} }); # refresh from DB
like(
$jobset->errormsg,
qr/aggregate job non_match_aggregate failed with the error: tests\.\*: constituent glob pattern had no matches/,
qr/in job non_match_aggregate:\ntests\.\*: constituent glob pattern had no matches/,
"The jobset records a relevant error message"
);
};
$jobset->discard_changes({ '+columns' => {'errormsg' => 'errormsg'} }); # refresh from DB
like(
$jobset->errormsg,
qr/aggregate job mixed_aggregate failed with the error: "constituentA": does not exist/,
"The jobset records a relevant error message"
);
done_testing;

View File

@@ -1,138 +0,0 @@
use strict;
use warnings;
use Setup;
use Test2::V0;
use Hydra::Helper::Exec;
use Data::Dumper;
my $ctx = test_context();
subtest "general glob testing" => sub {
my $jobsetCtx = $ctx->makeJobset(
expression => 'constituents-glob.nix',
);
my $jobset = $jobsetCtx->{"jobset"};
my ($res, $stdout, $stderr) = captureStdoutStderr(60,
("hydra-eval-jobset", $jobsetCtx->{"project"}->name, $jobset->name)
);
is($res, 0, "hydra-eval-jobset exits zero");
my $builds = {};
for my $build ($jobset->builds) {
$builds->{$build->job} = $build;
}
subtest "basic globbing works" => sub {
ok(defined $builds->{"ok_aggregate"}, "'ok_aggregate' is part of the jobset evaluation");
my @constituents = $builds->{"ok_aggregate"}->constituents->all;
is(2, scalar @constituents, "'ok_aggregate' has two constituents");
my @sortedConstituentNames = sort (map { $_->nixname } @constituents);
is($sortedConstituentNames[0], "empty-dir-A", "first constituent of 'ok_aggregate' is 'empty-dir-A'");
is($sortedConstituentNames[1], "empty-dir-B", "second constituent of 'ok_aggregate' is 'empty-dir-B'");
};
subtest "transitivity is OK" => sub {
ok(defined $builds->{"indirect_aggregate"}, "'indirect_aggregate' is part of the jobset evaluation");
my @constituents = $builds->{"indirect_aggregate"}->constituents->all;
is(1, scalar @constituents, "'indirect_aggregate' has one constituent");
is($constituents[0]->nixname, "direct_aggregate", "'indirect_aggregate' has 'direct_aggregate' as single constituent");
};
};
subtest "* selects all except current aggregate" => sub {
my $jobsetCtx = $ctx->makeJobset(
expression => 'constituents-glob-all.nix',
);
my $jobset = $jobsetCtx->{"jobset"};
my ($res, $stdout, $stderr) = captureStdoutStderr(60,
("hydra-eval-jobset", $jobsetCtx->{"project"}->name, $jobset->name)
);
subtest "no eval errors" => sub {
ok(utf8::decode($stderr), "Stderr output is UTF8-clean");
ok(
$stderr !~ "aggregate job ok_aggregate has a constituent .* that doesn't correspond to a Hydra build",
"Catchall wildcard must not select itself as constituent"
);
$jobset->discard_changes; # refresh from DB
is(
$jobset->has_error,
0,
"eval-errors non-empty"
);
};
my $builds = {};
for my $build ($jobset->builds) {
$builds->{$build->job} = $build;
}
subtest "two constituents" => sub {
ok(defined $builds->{"ok_aggregate"}, "'ok_aggregate' is part of the jobset evaluation");
my @constituents = $builds->{"ok_aggregate"}->constituents->all;
is(2, scalar @constituents, "'ok_aggregate' has two constituents");
my @sortedConstituentNames = sort (map { $_->nixname } @constituents);
is($sortedConstituentNames[0], "empty-dir-A", "first constituent of 'ok_aggregate' is 'empty-dir-A'");
is($sortedConstituentNames[1], "empty-dir-B", "second constituent of 'ok_aggregate' is 'empty-dir-B'");
};
};
subtest "trivial cycle check" => sub {
my $jobsetCtx = $ctx->makeJobset(
expression => 'constituents-cycle.nix',
);
my $jobset = $jobsetCtx->{"jobset"};
my ($res, $stdout, $stderr) = captureStdoutStderr(60,
("hydra-eval-jobset", $jobsetCtx->{"project"}->name, $jobset->name)
);
ok(
$stderr =~ "Found dependency cycle between jobs 'indirect_aggregate' and 'ok_aggregate'",
"Dependency cycle error is on stderr"
);
ok(utf8::decode($stderr), "Stderr output is UTF8-clean");
$jobset->discard_changes({ '+columns' => {'errormsg' => 'errormsg'} }); # refresh from DB
like(
$jobset->errormsg,
qr/Dependency cycle: indirect_aggregate <-> ok_aggregate/,
"eval-errors non-empty"
);
is(0, $jobset->builds->count, "No builds should be scheduled");
};
subtest "cycle check with globbing" => sub {
my $jobsetCtx = $ctx->makeJobset(
expression => 'constituents-cycle-glob.nix',
);
my $jobset = $jobsetCtx->{"jobset"};
my ($res, $stdout, $stderr) = captureStdoutStderr(60,
("hydra-eval-jobset", $jobsetCtx->{"project"}->name, $jobset->name)
);
ok(utf8::decode($stderr), "Stderr output is UTF8-clean");
$jobset->discard_changes({ '+columns' => {'errormsg' => 'errormsg'} }); # refresh from DB
like(
$jobset->errormsg,
qr/aggregate job indirect_aggregate failed with the error: Dependency cycle: indirect_aggregate <-> packages.constituentA/,
"packages.constituentA error missing"
);
# on this branch of Hydra, hydra-eval-jobset fails hard if an aggregate
# job is broken.
is(0, $jobset->builds->count, "Zero jobs are scheduled");
};
done_testing;

View File

@@ -1,14 +0,0 @@
rec {
path = "/nix/store/l9mg93sgx50y88p5rr6x1vib6j1rjsds-coreutils-9.1/bin";
mkDerivation = args:
derivation ({
system = builtins.currentSystem;
PATH = path;
} // args);
mkContentAddressedDerivation = args: mkDerivation ({
__contentAddressed = true;
outputHashMode = "recursive";
outputHashAlgo = "sha256";
} // args);
}

View File

@@ -1,34 +0,0 @@
with import ./config.nix;
{
packages.constituentA = mkDerivation {
name = "empty-dir-A";
builder = ./empty-dir-builder.sh;
_hydraAggregate = true;
_hydraGlobConstituents = true;
constituents = [ "*_aggregate" ];
};
packages.constituentB = mkDerivation {
name = "empty-dir-B";
builder = ./empty-dir-builder.sh;
};
ok_aggregate = mkDerivation {
name = "direct_aggregate";
_hydraAggregate = true;
_hydraGlobConstituents = true;
constituents = [
"packages.*"
];
builder = ./empty-dir-builder.sh;
};
indirect_aggregate = mkDerivation {
name = "indirect_aggregate";
_hydraAggregate = true;
constituents = [
"ok_aggregate"
];
builder = ./empty-dir-builder.sh;
};
}

View File

@@ -1,21 +0,0 @@
with import ./config.nix;
{
ok_aggregate = mkDerivation {
name = "direct_aggregate";
_hydraAggregate = true;
_hydraGlobConstituents = true;
constituents = [
"indirect_aggregate"
];
builder = ./empty-dir-builder.sh;
};
indirect_aggregate = mkDerivation {
name = "indirect_aggregate";
_hydraAggregate = true;
constituents = [
"ok_aggregate"
];
builder = ./empty-dir-builder.sh;
};
}

View File

@@ -1,22 +0,0 @@
with import ./config.nix;
{
packages.constituentA = mkDerivation {
name = "empty-dir-A";
builder = ./empty-dir-builder.sh;
};
packages.constituentB = mkDerivation {
name = "empty-dir-B";
builder = ./empty-dir-builder.sh;
};
ok_aggregate = mkDerivation {
name = "direct_aggregate";
_hydraAggregate = true;
_hydraGlobConstituents = true;
constituents = [
"*"
];
builder = ./empty-dir-builder.sh;
};
}

View File

@@ -1,31 +0,0 @@
with import ./config.nix;
{
packages.constituentA = mkDerivation {
name = "empty-dir-A";
builder = ./empty-dir-builder.sh;
};
packages.constituentB = mkDerivation {
name = "empty-dir-B";
builder = ./empty-dir-builder.sh;
};
ok_aggregate = mkDerivation {
name = "direct_aggregate";
_hydraAggregate = true;
_hydraGlobConstituents = true;
constituents = [
"packages.*"
];
builder = ./empty-dir-builder.sh;
};
indirect_aggregate = mkDerivation {
name = "indirect_aggregate";
_hydraAggregate = true;
constituents = [
"ok_aggregate"
];
builder = ./empty-dir-builder.sh;
};
}

View File

@@ -1,20 +0,0 @@
with import ./config.nix;
{
non_match_aggregate = mkDerivation {
name = "mixed_aggregate";
_hydraAggregate = true;
_hydraGlobConstituents = true;
constituents = [
"tests.*"
];
builder = ./empty-dir-builder.sh;
};
# Without a second job no jobset is attempted to be created
# (the only job would be broken)
# and thus the constituent validation is never reached.
dummy = mkDerivation {
name = "dummy";
builder = ./empty-dir-builder.sh;
};
}

View File

@@ -1,24 +0,0 @@
{
"enabled": 1,
"hidden": false,
"description": "declarative-jobset-example",
"nixexprinput": "src",
"nixexprpath": "declarative/generator.nix",
"checkinterval": 300,
"schedulingshares": 100,
"enableemail": false,
"emailoverride": "",
"keepnr": 3,
"inputs": {
"src": {
"type": "path",
"value": "/home/ma27/Projects/hydra-cppnix/t/jobs",
"emailresponsible": false
},
"jobspath": {
"type": "string",
"value": "/home/ma27/Projects/hydra-cppnix/t/jobs",
"emailresponsible": false
}
}
}

View File

@@ -27,7 +27,6 @@ testenv.prepend('PERL5LIB',
separator: ':'
)
testenv.prepend('PATH',
fs.parent(find_program('nix').full_path()),
fs.parent(hydra_evaluator.full_path()),
fs.parent(hydra_queue_runner.full_path()),
meson.project_source_root() / 'src/script',

View File

@@ -22,11 +22,11 @@ is(nrQueuedBuildsForJobset($jobset), 0, "Evaluating jobs/broken-constituent.nix
like(
$jobset->errormsg,
qr/^does-not-exist: does not exist$/m,
qr/^"does-not-exist": does not exist$/m,
"Evaluating jobs/broken-constituent.nix should log an error for does-not-exist");
like(
$jobset->errormsg,
qr/^does-not-evaluate: error: assertion 'false' failed/m,
qr/^"does-not-evaluate": "error: assertion 'false' failed/m,
"Evaluating jobs/broken-constituent.nix should log an error for does-not-evaluate");
done_testing;