THIS IS A TEST INSTANCE ONLY! REPOSITORIES CAN BE DELETED AT ANY TIME!

Git Source Code Mirror - This is a publish-only repository and all pull requests are ignored. Please follow Documentation/SubmittingPatches procedure for any of your improvements.
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

daemon: add --log-destination=(stderr|syslog|none) This new option can be used to override the implicit --syslog of --inetd, or to disable all logging. (While --detach also implies --syslog, --log-destination=stderr with --detach is useless since --detach disassociates the process from the original stderr.) --syslog is retained as an alias for --log-destination=syslog. --log-destination always overrides implicit --syslog regardless of option order. This is different than the “last one wins” logic that applies to some implicit options elsewhere in Git, but should hopefully be less confusing. (I also don’t know if *all* implicit options in Git follow “last one wins”.) The combination of --inetd with --log-destination=stderr is useful, for instance, when running `git daemon` as an instanced systemd service (with associated socket unit). In this case, log messages sent via syslog are received by the journal daemon, but run the risk of being processed at a time when the `git daemon` process has already exited (especially if the process was very short-lived, e.g. due to client error), so that the journal daemon can no longer read its cgroup and attach the message to the correct systemd unit (see systemd/systemd#2913 [1]). Logging to stderr instead can solve this problem, because systemd can connect stderr directly to the journal daemon, which then already knows which unit is associated with this stream. [1]: https://github.com/systemd/systemd/issues/2913 Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 year ago
daemon: add --log-destination=(stderr|syslog|none) This new option can be used to override the implicit --syslog of --inetd, or to disable all logging. (While --detach also implies --syslog, --log-destination=stderr with --detach is useless since --detach disassociates the process from the original stderr.) --syslog is retained as an alias for --log-destination=syslog. --log-destination always overrides implicit --syslog regardless of option order. This is different than the “last one wins” logic that applies to some implicit options elsewhere in Git, but should hopefully be less confusing. (I also don’t know if *all* implicit options in Git follow “last one wins”.) The combination of --inetd with --log-destination=stderr is useful, for instance, when running `git daemon` as an instanced systemd service (with associated socket unit). In this case, log messages sent via syslog are received by the journal daemon, but run the risk of being processed at a time when the `git daemon` process has already exited (especially if the process was very short-lived, e.g. due to client error), so that the journal daemon can no longer read its cgroup and attach the message to the correct systemd unit (see systemd/systemd#2913 [1]). Logging to stderr instead can solve this problem, because systemd can connect stderr directly to the journal daemon, which then already knows which unit is associated with this stream. [1]: https://github.com/systemd/systemd/issues/2913 Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 year ago
add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char **" instead of "char **" for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
daemon: add --log-destination=(stderr|syslog|none) This new option can be used to override the implicit --syslog of --inetd, or to disable all logging. (While --detach also implies --syslog, --log-destination=stderr with --detach is useless since --detach disassociates the process from the original stderr.) --syslog is retained as an alias for --log-destination=syslog. --log-destination always overrides implicit --syslog regardless of option order. This is different than the “last one wins” logic that applies to some implicit options elsewhere in Git, but should hopefully be less confusing. (I also don’t know if *all* implicit options in Git follow “last one wins”.) The combination of --inetd with --log-destination=stderr is useful, for instance, when running `git daemon` as an instanced systemd service (with associated socket unit). In this case, log messages sent via syslog are received by the journal daemon, but run the risk of being processed at a time when the `git daemon` process has already exited (especially if the process was very short-lived, e.g. due to client error), so that the journal daemon can no longer read its cgroup and attach the message to the correct systemd unit (see systemd/systemd#2913 [1]). Logging to stderr instead can solve this problem, because systemd can connect stderr directly to the journal daemon, which then already knows which unit is associated with this stream. [1]: https://github.com/systemd/systemd/issues/2913 Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 year ago
daemon: add --log-destination=(stderr|syslog|none) This new option can be used to override the implicit --syslog of --inetd, or to disable all logging. (While --detach also implies --syslog, --log-destination=stderr with --detach is useless since --detach disassociates the process from the original stderr.) --syslog is retained as an alias for --log-destination=syslog. --log-destination always overrides implicit --syslog regardless of option order. This is different than the “last one wins” logic that applies to some implicit options elsewhere in Git, but should hopefully be less confusing. (I also don’t know if *all* implicit options in Git follow “last one wins”.) The combination of --inetd with --log-destination=stderr is useful, for instance, when running `git daemon` as an instanced systemd service (with associated socket unit). In this case, log messages sent via syslog are received by the journal daemon, but run the risk of being processed at a time when the `git daemon` process has already exited (especially if the process was very short-lived, e.g. due to client error), so that the journal daemon can no longer read its cgroup and attach the message to the correct systemd unit (see systemd/systemd#2913 [1]). Logging to stderr instead can solve this problem, because systemd can connect stderr directly to the journal daemon, which then already knows which unit is associated with this stream. [1]: https://github.com/systemd/systemd/issues/2913 Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 year ago
daemon: add --log-destination=(stderr|syslog|none) This new option can be used to override the implicit --syslog of --inetd, or to disable all logging. (While --detach also implies --syslog, --log-destination=stderr with --detach is useless since --detach disassociates the process from the original stderr.) --syslog is retained as an alias for --log-destination=syslog. --log-destination always overrides implicit --syslog regardless of option order. This is different than the “last one wins” logic that applies to some implicit options elsewhere in Git, but should hopefully be less confusing. (I also don’t know if *all* implicit options in Git follow “last one wins”.) The combination of --inetd with --log-destination=stderr is useful, for instance, when running `git daemon` as an instanced systemd service (with associated socket unit). In this case, log messages sent via syslog are received by the journal daemon, but run the risk of being processed at a time when the `git daemon` process has already exited (especially if the process was very short-lived, e.g. due to client error), so that the journal daemon can no longer read its cgroup and attach the message to the correct systemd unit (see systemd/systemd#2913 [1]). Logging to stderr instead can solve this problem, because systemd can connect stderr directly to the journal daemon, which then already knows which unit is associated with this stream. [1]: https://github.com/systemd/systemd/issues/2913 Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 year ago
[PATCH] daemon.c and path.enter_repo(): revamp path validation. The whitelist of git-daemon is checked against return value from enter_repo(), and enter_repo() used to return the value obtained from getcwd() to avoid directory aliasing issues as discussed earier (mid October 2005). Unfortunately, it did not go well as we hoped. For example, /pub on a kernel.org public machine is a symlink to its real mountpoint, and it is understandable that the administrator does not want to adjust the whitelist every time /pub needs to point at a different partition for storage allcation or whatever reasons. Being able to keep using /pub/scm as the whitelist is a desirable property. So this version of enter_repo() reports what it used to chdir() and validate, but does not use getcwd() to canonicalize the directory name. When it sees a user relative path ~user/path, it internally resolves it to try chdir() there, but it still reports ~user/path (possibly after appending .git if allowed to do so, in which case it would report ~user/path.git). What this means is that if a whitelist wants to allow a user relative path, it needs to say "~" (for all users) or list user home directories like "~alice" "~bob". And no, you cannot say /home if the advertised way to access user home directories are ~alice,~bob, etc. The whole point of this is to avoid unnecessary aliasing issues. Anyway, because of this, daemon needs to do a bit more work to guard itself. Namely, it needs to make sure that the accessor does not try to exploit its leading path match rule by inserting /../ in the middle or hanging /.. at the end. I resurrected the belts and suspender paranoia code HPA did for this purpose. This check cannot be done in the enter_repo() unconditionally, because there are valid callers of enter_repo() that want to honor /../; authorized users coming over ssh to run send-pack and fetch-pack should be allowed to do so. Signed-off-by: Junio C Hamano <junkio@cox.net>
14 years ago
[PATCH] daemon.c and path.enter_repo(): revamp path validation. The whitelist of git-daemon is checked against return value from enter_repo(), and enter_repo() used to return the value obtained from getcwd() to avoid directory aliasing issues as discussed earier (mid October 2005). Unfortunately, it did not go well as we hoped. For example, /pub on a kernel.org public machine is a symlink to its real mountpoint, and it is understandable that the administrator does not want to adjust the whitelist every time /pub needs to point at a different partition for storage allcation or whatever reasons. Being able to keep using /pub/scm as the whitelist is a desirable property. So this version of enter_repo() reports what it used to chdir() and validate, but does not use getcwd() to canonicalize the directory name. When it sees a user relative path ~user/path, it internally resolves it to try chdir() there, but it still reports ~user/path (possibly after appending .git if allowed to do so, in which case it would report ~user/path.git). What this means is that if a whitelist wants to allow a user relative path, it needs to say "~" (for all users) or list user home directories like "~alice" "~bob". And no, you cannot say /home if the advertised way to access user home directories are ~alice,~bob, etc. The whole point of this is to avoid unnecessary aliasing issues. Anyway, because of this, daemon needs to do a bit more work to guard itself. Namely, it needs to make sure that the accessor does not try to exploit its leading path match rule by inserting /../ in the middle or hanging /.. at the end. I resurrected the belts and suspender paranoia code HPA did for this purpose. This check cannot be done in the enter_repo() unconditionally, because there are valid callers of enter_repo() that want to honor /../; authorized users coming over ssh to run send-pack and fetch-pack should be allowed to do so. Signed-off-by: Junio C Hamano <junkio@cox.net>
14 years ago
add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char **" instead of "char **" for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
[PATCH] daemon.c and path.enter_repo(): revamp path validation. The whitelist of git-daemon is checked against return value from enter_repo(), and enter_repo() used to return the value obtained from getcwd() to avoid directory aliasing issues as discussed earier (mid October 2005). Unfortunately, it did not go well as we hoped. For example, /pub on a kernel.org public machine is a symlink to its real mountpoint, and it is understandable that the administrator does not want to adjust the whitelist every time /pub needs to point at a different partition for storage allcation or whatever reasons. Being able to keep using /pub/scm as the whitelist is a desirable property. So this version of enter_repo() reports what it used to chdir() and validate, but does not use getcwd() to canonicalize the directory name. When it sees a user relative path ~user/path, it internally resolves it to try chdir() there, but it still reports ~user/path (possibly after appending .git if allowed to do so, in which case it would report ~user/path.git). What this means is that if a whitelist wants to allow a user relative path, it needs to say "~" (for all users) or list user home directories like "~alice" "~bob". And no, you cannot say /home if the advertised way to access user home directories are ~alice,~bob, etc. The whole point of this is to avoid unnecessary aliasing issues. Anyway, because of this, daemon needs to do a bit more work to guard itself. Namely, it needs to make sure that the accessor does not try to exploit its leading path match rule by inserting /../ in the middle or hanging /.. at the end. I resurrected the belts and suspender paranoia code HPA did for this purpose. This check cannot be done in the enter_repo() unconditionally, because there are valid callers of enter_repo() that want to honor /../; authorized users coming over ssh to run send-pack and fetch-pack should be allowed to do so. Signed-off-by: Junio C Hamano <junkio@cox.net>
14 years ago
daemon: Strictly parse the "extra arg" part of the command Since 1.4.4.5 (49ba83fb67 "Add virtualization support to git-daemon") git daemon enters an infinite loop and never terminates if a client hides any extra arguments in the initial request line which is not exactly "\0host=blah\0". Since that change, a client must never insert additional extra arguments, or attempt to use any argument other than "host=", as any daemon will get stuck parsing the request line and will never complete the request. Since the client can't tell if the daemon is patched or not, it is not possible to know if additional extra args might actually be able to be safely requested. If we ever need to extend the git daemon protocol to support a new feature, we may have to do something like this to the exchange: # If both support git:// v2 # C: 000cgit://v2 S: 0010ok host user C: 0018host git.kernel.org C: 0027git-upload-pack /pub/linux-2.6.git S: ...git-upload-pack header... # If client supports git:// v2, server does not: # C: 000cgit://v2 S: <EOF> C: 003bgit-upload-pack /pub/linux-2.6.git\0host=git.kernel.org\0 S: ...git-upload-pack header... This requires the client to create two TCP connections to talk to an older git daemon, however all daemons since the introduction of daemon.c will safely reject the unknown "git://v2" command request, so the client can quite easily determine the server supports an older protocol. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
daemon: Strictly parse the "extra arg" part of the command Since 1.4.4.5 (49ba83fb67 "Add virtualization support to git-daemon") git daemon enters an infinite loop and never terminates if a client hides any extra arguments in the initial request line which is not exactly "\0host=blah\0". Since that change, a client must never insert additional extra arguments, or attempt to use any argument other than "host=", as any daemon will get stuck parsing the request line and will never complete the request. Since the client can't tell if the daemon is patched or not, it is not possible to know if additional extra args might actually be able to be safely requested. If we ever need to extend the git daemon protocol to support a new feature, we may have to do something like this to the exchange: # If both support git:// v2 # C: 000cgit://v2 S: 0010ok host user C: 0018host git.kernel.org C: 0027git-upload-pack /pub/linux-2.6.git S: ...git-upload-pack header... # If client supports git:// v2, server does not: # C: 000cgit://v2 S: <EOF> C: 003bgit-upload-pack /pub/linux-2.6.git\0host=git.kernel.org\0 S: ...git-upload-pack header... This requires the client to create two TCP connections to talk to an older git daemon, however all daemons since the introduction of daemon.c will safely reject the unknown "git://v2" command request, so the client can quite easily determine the server supports an older protocol. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
daemon: Strictly parse the "extra arg" part of the command Since 1.4.4.5 (49ba83fb67 "Add virtualization support to git-daemon") git daemon enters an infinite loop and never terminates if a client hides any extra arguments in the initial request line which is not exactly "\0host=blah\0". Since that change, a client must never insert additional extra arguments, or attempt to use any argument other than "host=", as any daemon will get stuck parsing the request line and will never complete the request. Since the client can't tell if the daemon is patched or not, it is not possible to know if additional extra args might actually be able to be safely requested. If we ever need to extend the git daemon protocol to support a new feature, we may have to do something like this to the exchange: # If both support git:// v2 # C: 000cgit://v2 S: 0010ok host user C: 0018host git.kernel.org C: 0027git-upload-pack /pub/linux-2.6.git S: ...git-upload-pack header... # If client supports git:// v2, server does not: # C: 000cgit://v2 S: <EOF> C: 003bgit-upload-pack /pub/linux-2.6.git\0host=git.kernel.org\0 S: ...git-upload-pack header... This requires the client to create two TCP connections to talk to an older git daemon, however all daemons since the introduction of daemon.c will safely reject the unknown "git://v2" command request, so the client can quite easily determine the server supports an older protocol. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 years ago
pkt-line: provide a LARGE_PACKET_MAX static buffer Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
6 years ago
pkt-line: share buffer/descriptor reading implementation The packet_read function reads from a descriptor. The packet_get_line function is similar, but reads from an in-memory buffer, and uses a completely separate implementation. This patch teaches the generic packet_read function to accept either source, and we can do away with packet_get_line's implementation. There are two other differences to account for between the old and new functions. The first is that we used to read into a strbuf, but now read into a fixed size buffer. The only two callers are fine with that, and in fact it simplifies their code, since they can use the same static-buffer interface as the rest of the packet_read_line callers (and we provide a similar convenience wrapper for reading from a buffer rather than a descriptor). This is technically an externally-visible behavior change in that we used to accept arbitrary sized packets up to 65532 bytes, and now cap out at LARGE_PACKET_MAX, 65520. In practice this doesn't matter, as we use it only for parsing smart-http headers (of which there is exactly one defined, and it is small and fixed-size). And any extension headers would be breaking the protocol to go over LARGE_PACKET_MAX anyway. The other difference is that packet_get_line would return on error rather than dying. However, both callers of packet_get_line are actually improved by dying. The first caller does its own error checking, but we can drop that; as a result, we'll actually get more specific reporting about protocol breakage when packet_read dies internally. The only downside is that packet_read will not print the smart-http URL that failed, but that's not a big deal; anybody not debugging can already see the remote's URL already, and anybody debugging would want to run with GIT_CURL_VERBOSE anyway to see way more information. The second caller, which is just trying to skip past any extra smart-http headers (of which there are none defined, but which we allow to keep room for future expansion), did not error check at all. As a result, it would treat an error just like a flush packet. The resulting mess would generally cause an error later in get_remote_heads, but now we get error reporting much closer to the source of the problem. Brown-paper-bag-fixes-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
6 years ago
daemon: fix length computation in newline stripping When git-daemon gets a pktline request, we strip off any trailing newline, replacing it with a NUL. Clients prior to 5ad312bede (in git v1.4.0) would send: git-upload-pack repo.git\n and we need to strip it off to understand their request. After 5ad312bede, we send the host attribute but no newline, like: git-upload-pack repo.git\0host=example.com\0 Both of these are parsed correctly by git-daemon. But if some client were to combine the two: git-upload-pack repo.git\n\0host=example.com\0 we don't parse it correctly. The problem is that we use the "len" variable to record the position of the NUL separator, but then decrement it when we strip the newline. So we start with: git-upload-pack repo.git\n\0host=example.com\0 ^-- len and end up with: git-upload-pack repo.git\0\0host=example.com\0 ^-- len This is arguably correct, since "len" tells us the length of the initial string, but we don't actually use it for that. What we do use it for is finding the offset of the extended attributes; they used to be at len+1, but are now at len+2. We can solve that by just leaving "len" where it is. We don't have to care about the length of the shortened string, since we just treat it like a C string. No version of Git ever produced such a string, but it seems like the daemon code meant to handle this case (and it seems like a reasonable thing for somebody to do in a 3rd-party implementation). Reported-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 year ago
add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char **" instead of "char **" for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char **" instead of "char **" for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 years ago
daemon: add --log-destination=(stderr|syslog|none) This new option can be used to override the implicit --syslog of --inetd, or to disable all logging. (While --detach also implies --syslog, --log-destination=stderr with --detach is useless since --detach disassociates the process from the original stderr.) --syslog is retained as an alias for --log-destination=syslog. --log-destination always overrides implicit --syslog regardless of option order. This is different than the “last one wins” logic that applies to some implicit options elsewhere in Git, but should hopefully be less confusing. (I also don’t know if *all* implicit options in Git follow “last one wins”.) The combination of --inetd with --log-destination=stderr is useful, for instance, when running `git daemon` as an instanced systemd service (with associated socket unit). In this case, log messages sent via syslog are received by the journal daemon, but run the risk of being processed at a time when the `git daemon` process has already exited (especially if the process was very short-lived, e.g. due to client error), so that the journal daemon can no longer read its cgroup and attach the message to the correct systemd unit (see systemd/systemd#2913 [1]). Logging to stderr instead can solve this problem, because systemd can connect stderr directly to the journal daemon, which then already knows which unit is associated with this stream. [1]: https://github.com/systemd/systemd/issues/2913 Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 year ago
daemon: add --log-destination=(stderr|syslog|none) This new option can be used to override the implicit --syslog of --inetd, or to disable all logging. (While --detach also implies --syslog, --log-destination=stderr with --detach is useless since --detach disassociates the process from the original stderr.) --syslog is retained as an alias for --log-destination=syslog. --log-destination always overrides implicit --syslog regardless of option order. This is different than the “last one wins” logic that applies to some implicit options elsewhere in Git, but should hopefully be less confusing. (I also don’t know if *all* implicit options in Git follow “last one wins”.) The combination of --inetd with --log-destination=stderr is useful, for instance, when running `git daemon` as an instanced systemd service (with associated socket unit). In this case, log messages sent via syslog are received by the journal daemon, but run the risk of being processed at a time when the `git daemon` process has already exited (especially if the process was very short-lived, e.g. due to client error), so that the journal daemon can no longer read its cgroup and attach the message to the correct systemd unit (see systemd/systemd#2913 [1]). Logging to stderr instead can solve this problem, because systemd can connect stderr directly to the journal daemon, which then already knows which unit is associated with this stream. [1]: https://github.com/systemd/systemd/issues/2913 Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 year ago