Here’s a little SQL problem for the lazy web. It’s something that I’m trying to implement in Glom for the Repository Analyzer:
Let’s say we have these tables:
Packages:
package_name |
package_description |
something |
something description |
somethingelse |
something else description |
somethingmore |
something more description |
and Package Scans:
package_name |
version |
license_id |
something |
0.1 |
43 |
something |
0.2 |
43 |
something |
0.3 |
44 |
somethingelse |
1.5 |
43 |
somethingmore |
0.9 |
40 |
Now, I want to get the package.package_description for all packages that appear in package_scans with license_id 43, which would look like this:
‘something description’
‘something else description’
The best I can do so far is a SELECT on package_scans, doing a LEFT OUTER JOIN:
SELECT “relationship_package”.”package_description” FROM “package_scans” LEFT OUTER JOIN “packages” AS “relationship_package” ON (“package_scans”.”package_name” = “relationship_package”.”package_name”) WHERE “package_scans”.”license_id” = 43;
which gives me duplicates, like so:
‘something description’
‘something description’
‘something else description’
If possible I’d like to do this without GROUP BY. I feel there must be a simpler way to say “give me a row for each record in packages for which the (indirect) relationship is true”. If the main FROM table could somehow be packages, instead of package_scans, then the LEFT OUTER JOIN would cause me to have only one row for each relevant packages record. In general, Glom never gives you repeat rows because that’s confusing.
If I can figure out what SQL should be generated, I could imagine that I might get that result in Glom by defining a relationship in terms of a doubly-related field. So the user could say “Show me records from the packages_with_package_scan_license_id” relationship. That relationship would be defined as something like
records from packages where licenses.license_id == packages::package_scans::license_id
But maybe a GROUP BY option really is the clearest.
Update: I feel like the sub-select idea might be what I want, if I can figure the syntax out. I like the idea of showing records from a relationship that is itself defined by a link between a value and a field in another related set of records.