More Annotations

![A complete backup of hometechexperts.ca](https://www.archivebay.com/archive/45f1198a-bfa7-415b-97f0-bb757ea8020e.png)
A complete backup of hometechexperts.ca

![Find restaurants, bars, pubs, clubs or cafés in Münster](https://www.archivebay.com/archive/7d9bcbbc-2a9d-47b1-9469-5db56fee10d2.png)
Find restaurants, bars, pubs, clubs or cafés in Münster

![A complete backup of redneckblinds.com](https://www.archivebay.com/archive/a6590464-1c32-4970-ba4b-3df502da8d99.png)
A complete backup of redneckblinds.com

![Open Educational Resources | Lyryx with Open Texts & Online Homework](https://www.archivebay.com/archive/42084382-48d6-45e1-bea8-385fa808462c.png)
Open Educational Resources | Lyryx with Open Texts & Online Homework

Favourite Annotations

![BJJ & Muay Thai | Balcatta | The Academy of Mixed Martial Arts](https://www.archivebay.com/archive/d5c8b416-ffde-4841-a7af-e197187f9927.png)
BJJ & Muay Thai | Balcatta | The Academy of Mixed Martial Arts

![A complete backup of westlawnextcanada.com](https://www.archivebay.com/archive/ce018ce4-a019-423f-a627-a8d71c0f8f6b.png)
A complete backup of westlawnextcanada.com

![Welcome to Crispy Minis® | CrispyMinis.ca](https://www.archivebay.com/archive/ed0423d6-e094-437a-8757-877b6369f8d1.png)
Welcome to Crispy Minis® | CrispyMinis.ca

![intuitiveforagerblog](https://www.archivebay.com/archive/eb1c7933-e0c7-4c48-a6ba-6e02034dbabc.png)
intuitiveforagerblog

![PHP Training in Kolkata | Web Design Course in Kolkata | MEAN Stack Training in Kolkata](https://www.archivebay.com/archive/9fb3ba84-05fe-4d56-88a3-6955fd882ec4.png)
PHP Training in Kolkata | Web Design Course in Kolkata | MEAN Stack Training in Kolkata

![Bondage website Transvestite bondage crossdressers tied up sissy girls bound and gagged](https://www.archivebay.com/archive/5ddb9428-5010-438b-9558-5dab4a9b0f49.png)
Bondage website Transvestite bondage crossdressers tied up sissy girls bound and gagged

![Art Jewelry in Metal and Macrame by Designer Coco Paniora Salinas](https://www.archivebay.com/archive/b939e766-e1a4-4425-a4ea-f6903b830271.png)
Art Jewelry in Metal and Macrame by Designer Coco Paniora Salinas
Text
Serving map tiles from SQLite with MBTiles and datasette-tiles. See more on simonwillison.net.

Datasette: instantly create and publish an API for your SQLite databases. I just shipped the first public version of datasette, a new tool for creating and publishing JSON APIs for SQLite databases. You can try it out right now at fivethirtyeight.datasettes.com, where you can explore SQLite databases I built from Creative Commons licensed CSV files published by FiveThirtyEight.

Controlling the style of dumped YAML using PyYAML. I had a list of Python dictionaries I wanted to output as YAML, but I wanted to control the style of the output.

Using unnest() to use a comma-separated string as the input to an IN query. django-sql-dashboard lets you define a SQL query plus one or more text inputs that the user can provide in order to execute the query. I wanted the user to provide a comma-separated list of IDs which I would then use as input to a WHERE column IN query. I figured out how to do that using the unnest() function.

Embedding JavaScript in a Jupyter notebook. I recently found out modern browsers include a JavaScript API for creating public/private keys for cryptography.

geojson-to-sqlite (Simon Willison's Weblog, 8:12 p.m., 16th October 2020; tagged projects, sqlite, xml, datasette). One of the export formats we are working with is GeoJSON. I have a tool called geojson-to-sqlite which I released last year: this week I released an updated version with the ability to create SpatiaLite indexes and a --nl option for consuming …

Video introduction to Datasette and sqlite-utils.
I put together a 17 minute video introduction to Datasette and sqlite-utils for FOSDEM 2021, showing how you can use Datasette to explore data, and demonstrating using the sqlite-utils command-line tool to convert a CSV file into a SQLite database, and then publish it using datasette publish. Here's the video, plus annotated screen captures.

Refactoring databases with sqlite-utils extract. Yesterday I described the new sqlite-utils transform mechanism for applying SQLite table transformations that go beyond those supported by ALTER TABLE. The other new feature in sqlite-utils 2.20 builds on that capability to allow you to refactor a database table by extracting columns into separate tables. I've called it sqlite-utils extract.

Git scraping: track changes over time by scraping to a Git repository. Git scraping is the name I've given a scraping technique that I've been experimenting with for a few years now. It's really effective, and more people should use it. Update 5th March 2021: I presented a version of this post as a five minute lightning talk at NICAR 2021, which includes a live coding demo of building a …

Cross-database queries in SQLite (and weeknotes). I released Datasette 0.55 and sqlite-utils 3.6 this week with a common theme across both releases: supporting cross-database joins. Cross-database queries in Datasette. SQLite databases are single files on disk. I really love this characteristic: it makes them easy to create, copy and move around.
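The refactor that sqlite-utils extract automates can be imitated with plain SQL. Here is a minimal sketch using Python's stdlib sqlite3 module, not the sqlite-utils API itself; the table and column names are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table trees (id integer primary key, name text, species text);
insert into trees (name, species) values
  ('Tree 1', 'Oak'), ('Tree 2', 'Oak'), ('Tree 3', 'Pine');
""")

# Extract the repeated 'species' column into its own lookup table,
# the same shape of refactor that sqlite-utils extract performs
conn.executescript("""
create table species (id integer primary key, species text unique);
insert into species (species) select distinct species from trees;
alter table trees add column species_id integer references species(id);
update trees set species_id = (
  select id from species where species.species = trees.species
);
""")

# Query through the new foreign key to confirm nothing was lost
rows = conn.execute("""
  select trees.name, species.species
  from trees join species on species.id = trees.species_id
  order by trees.id
""").fetchall()
print(rows)  # [('Tree 1', 'Oak'), ('Tree 2', 'Oak'), ('Tree 3', 'Pine')]
```

The real tool goes further, rebuilding the table so the old text column disappears (older SQLite versions cannot drop columns with ALTER TABLE), which is exactly why having it automated is useful.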
Weeknotes: Docker architectures, sqlite-utils 3.7, nearly there with Datasette 0.57 (six days ago). This week I learned a whole bunch about using Docker to emulate different architectures, released sqlite-utils 3.7 and made a ton of progress towards the almost-ready-to-ship Datasette 0.57.
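The PyYAML style control mentioned above comes down, in its simplest form, to the default_flow_style argument of yaml.dump(). A small sketch (assuming PyYAML is installed; the sample data is made up):

```python
import yaml

data = [{"name": "datasette", "language": "Python"}]

# Block style: one "key: value" per line, the usual human-readable layout
block = yaml.dump(data, default_flow_style=False, sort_keys=False)

# Flow style: inline, JSON-like {key: value} mappings
flow = yaml.dump(data, default_flow_style=True, sort_keys=False)

print(block)
# - name: datasette
#   language: Python
print(flow)
# [{name: datasette, language: Python}]
```

Finer-grained control (mixing styles within one document) is possible too, via custom representers, but this single flag covers the common case.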
Django SQL Dashboard. I've released the first non-alpha version of Django SQL Dashboard, which provides an interface for running arbitrary read-only SQL queries directly against a PostgreSQL database, protected by the Django authentication scheme. It can also be used to create saved dashboards that can be published or shared internally.
Fun with binary data and SQLite. This week I've been mainly experimenting with binary data storage in SQLite. sqlite-utils can now insert data from binary files, and datasette-media can serve content over HTTP that originated as binary BLOBs in a database file.

Building a self-updating profile README for GitHub. GitHub quietly released a new feature at some point in the past few days: profile READMEs. Create a repository with the same name as your GitHub account (in my case …

Using a self-rewriting README powered by GitHub Actions to track TILs. I've started tracking TILs (Today I Learneds), inspired by this five-year-and-counting collection by Josh Branchaud on GitHub (found via Hacker News). I'm keeping mine in GitHub too, and using GitHub Actions to automatically generate an index page README in the repository and a SQLite-backed search engine.

Things I learned about shapefiles building shapefile-to-sqlite. The latest in my series of x-to-sqlite tools is shapefile-to-sqlite. I learned a whole bunch of things about the ESRI shapefile format while building it.
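Binary storage in SQLite, as described in the entry above, comes down to BLOB columns. A minimal stdlib sketch (not the sqlite-utils or datasette-media API, just the underlying mechanism):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table files (name text, content blob)")

# Wrapping bytes in sqlite3.Binary stores them as a BLOB value
payload = bytes(range(256))
conn.execute(
    "insert into files values (?, ?)", ("demo.bin", sqlite3.Binary(payload))
)

stored = conn.execute(
    "select content from files where name = ?", ("demo.bin",)
).fetchone()[0]
assert stored == payload  # round-trips byte-for-byte
```

Serving such a BLOB over HTTP is then a matter of reading it back and setting an appropriate content type, which is the part datasette-media handles.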
Enabling WAL mode for SQLite database files. I was getting occasional "Error: database is locked" messages from a Datasette instance that was running against a bunch of different SQLite files that were updated by cron scripts (my personal Dogsheep). I had read about SQLite's WAL mode but never fully understood how it works.

Downloading MapZen elevation tiles. Via Tony Hirst I found out about MapZen's elevation tiles, which encode elevation data in PNG and other formats. These days they …
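Enabling WAL mode, mentioned in the entry above, is a single PRAGMA, and it is persistent: it is recorded in the database file itself. A sketch with Python's sqlite3:

```python
import os
import sqlite3
import tempfile

# WAL only applies to on-disk databases (":memory:" would report "memory")
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

# The pragma returns the journal mode actually in effect
mode = conn.execute("PRAGMA journal_mode=wal").fetchone()[0]
print(mode)  # wal
conn.close()

# Persistence check: a fresh connection still sees WAL mode
conn2 = sqlite3.connect(path)
assert conn2.execute("PRAGMA journal_mode").fetchone()[0] == "wal"
```

WAL lets readers proceed while a writer is active, which is why it helps with "database is locked" errors when cron scripts update files a server is reading.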
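The git scraping technique described earlier boils down to: fetch a resource on a schedule, write it to a file in a repository, and commit only when it changed, so the git history records every change. A minimal sketch (function and file names are invented for illustration; the real workflow usually runs under a GitHub Actions cron schedule):

```python
import pathlib
import subprocess


def scrape_to_git(repo_dir, filename, fetch):
    """Write fetch()'s result into the repo and commit it if it changed.

    Returns True when a new commit was made, False when nothing changed.
    """
    target = pathlib.Path(repo_dir) / filename
    new_content = fetch()
    if target.exists() and target.read_text() == new_content:
        return False  # unchanged: no commit, history stays clean
    target.write_text(new_content)
    subprocess.run(["git", "-C", str(repo_dir), "add", filename], check=True)
    subprocess.run(
        ["git", "-C", str(repo_dir), "commit", "-m", f"Update {filename}"],
        check=True,
    )
    return True
```

In practice `fetch` would hit an HTTP endpoint (a JSON feed, an HTML page), and `git log -p` on the file becomes a free change-over-time dataset.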
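The unnest() trick described near the top is PostgreSQL-specific: the raw comma-separated string is passed as a single bound parameter and the database splits it, so nothing is string-interpolated into the SQL. A hedged sketch of the query shape (table and column names invented, not from the original post):

```python
# PostgreSQL-only pattern: string_to_array() splits the user's input on
# commas, and unnest() turns the resulting array into rows that IN can use.
sql = """
select * from report
where id in (
    select cast(unnest(string_to_array(%(ids)s, ',')) as integer)
)
"""

# The whole user-supplied string travels as one parameter
params = {"ids": "13,72,19"}
```

With a driver like psycopg2 this would run as `cursor.execute(sql, params)`; the cast guards against non-numeric input reaching the integer comparison.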
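The cross-database joins discussed earlier rest on SQLite's ATTACH DATABASE statement, which makes several database files visible on one connection. A minimal sketch with Python's sqlite3 (file and table names invented for illustration):

```python
import os
import sqlite3
import tempfile

tmp = tempfile.mkdtemp()
people_db = os.path.join(tmp, "people.db")
orders_db = os.path.join(tmp, "orders.db")

# Two separate single-file databases
with sqlite3.connect(people_db) as db:
    db.execute("create table people (id integer primary key, name text)")
    db.execute("insert into people values (1, 'Cleo')")
with sqlite3.connect(orders_db) as db:
    db.execute("create table orders (person_id integer, item text)")
    db.execute("insert into orders values (1, 'dog treats')")

# ATTACH makes both files visible on one connection, so a join can span them
conn = sqlite3.connect(people_db)
conn.execute("attach database ? as other", (orders_db,))
row = conn.execute("""
    select people.name, o.item
    from people join other.orders as o on people.id = o.person_id
""").fetchone()
print(row)  # ('Cleo', 'dog treats')
```

This is the mechanism Datasette 0.55 exposes when cross-database queries are enabled: each attached file becomes a schema prefix in SQL.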
Weeknotes: airtable-export, generating screenshots in GitHub Actions, Dogsheep! This week I figured out how to populate Datasette from Airtable, wrote code to generate social media preview card page screenshots using Puppeteer, and made a big breakthrough with my Dogsheep project.
A quote from Using async and await in Flask 2.0. Async functions require an event loop to run. Flask, as a WSGI application, uses one worker to handle one request/response cycle. When a request comes in to an async view, Flask will start an event loop in a thread, run the view function there, then return the result.
Simon Willison: TIL. Things I've learned, collected in simonw/til. You may also enjoy my blog. Atom feed. Recently added: Running Docker on a remote M1 Mac, Turning an array of arrays into objects with jq, Docker Compose for Django development, Finding duplicate records by matching name and nearby distance, Switching between gcloud accounts.

Using SQL to find my best photo of a pelican according to Apple Photos. According to the Apple Photos internal SQLite database, this is the most aesthetically pleasing photograph I have ever taken of a pelican:
Installing Selenium for Python on macOS with chromedriver. I needed to run Selenium on macOS for the first time today. Here's how I got it working: install the chromedriver binary. If you have homebrew, this is by far the easiest option: `brew cask install chromedriver`. This also ensures `chromedriver` is on your path, which means you don't need to use an explicit `chromedriver_path` later on. You still need to run it once in the terminal …
Django Admin action for exporting selected rows as CSV. I wanted to add an action option to the Django Admin for exporting the currently selected set of rows (or every row in the table) as a …
DATASETTE: INSTANTLY CREATE AND PUBLISH AN API FOR YOUR Datasette: instantly create and publish an API for your SQLite databases. I just shipped the first public version of datasette, a new tool for creating and publishing JSON APIs for SQLite databases.. You can try out out right now at fivethirtyeight.datasettes.com, where you can explore SQLite databases I built from Creative Commons licensed CSV files published by FiveThirtyEight. A QUOTE FROM USING ASYNC AND AWAIT IN FLASK 2.0 A quote from Using async and await in Flask 2.0. Async functions require an event loop to run. Flask, as a WSGI application, uses one worker to handle one request/response cycle. When a request comes in to an async view, Flask will start an event loop in a thread, run the view function there, then return the result. CONTROLLING THE STYLE OF DUMPED YAML USING PYYAML Controlling the style of dumped YAML using PyYAML. I had a list of Python dictionaries I wanted to output as YAML, but I wanted to control the style of the output. REFACTORING DATABASES WITH SQLITE-UTILS EXTRACT Refactoring databases with sqlite-utils extract. Yesterday I described the new sqlite-utils transform mechanism for applying SQLite table transformations that go beyond those supported by ALTER TABLE.The other new feature in sqlite-utils 2.20 builds on that capability to allow you to refactor a database table by extracting columns into separate tables. I’ve called it sqlite-utils extract. DJANGO SQL DASHBOARD Django SQL Dashboard. I’ve released the first non-alpha version of Django SQL Dashboard, which provides an interface for running arbitrary read-only SQL queries directly against a PostgreSQL database, protected by the Django authentication scheme.It can also be used to create saved dashboards that can be published or sharedinternally.
BUILDING A SELF-UPDATING PROFILE README FOR GITHUB GitHub quietly released a new feature at some point in the past few days: profile READMEs. Create a repository with the same name as your GitHub account (in my case THINGS I LEARNED ABOUT SHAPEFILES BUILDING SHAPEFILE-TO-SQLITE Things I learned about shapefiles building shapefile-to-sqlite. The latest in my series of x-to-sqlite tools is shapefile-to-sqlite.I learned a whole bunch of things about the ESRI shapefile format whilebuilding it.
CROSS-DATABASE QUERIES IN SQLITE (AND WEEKNOTES) Cross-database queries in SQLite (and weeknotes) I released Datasette 0.55 and sqlite-utils 3.6 this week with a common theme across both releases: supporting cross-database joins.. Cross-database queries in Datasette. SQLite databases are single files on disk. I really love this characteristic—it makes them easy to create, copy and movearound.
WEEKNOTES: AIRTABLE-EXPORT, GENERATING SCREENSHOTS IN Weeknotes: airtable-export, generating screenshots in GitHub Actions, Dogsheep! This week I figured out how to populate Datasette from Airtable, wrote code to generate social media preview card page screenshots using Puppeteer, and made a big breakthrough with myDogsheep project.
USABLE HORIZONTAL SCROLLBARS IN THE DJANGO ADMIN FOR MOUSE USERS
I got a complaint from a Windows-with-mouse user of a Django admin project I'm working on: they couldn't see the right hand columns in a table without scrolling horizontally, but since the horizontal scrollbar was only available at the bottom of the page they had to scroll all the way to the bottom first in order to scroll sideways.

THE BEHAVIORAL CHANGE STAIRWAY MODEL
BCSM is the FBI's model for crisis negotiation, but it looks like it could be a useful negotiation framework for all kinds of other conflict mediation as well. Posted 19th April 2019 at 5:46 pm.

ENABLING WAL MODE FOR SQLITE DATABASE FILES
I was getting occasional "Error: database is locked" messages from a Datasette instance that was running against a bunch of different SQLite files that were updated by cron scripts (my personal Dogsheep). I had read about SQLite's WAL mode but never fully understood how it works.
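The WAL-mode fix mentioned in that TIL can be sketched with Python's standard sqlite3 module (the file path here is illustrative; sqlite-utils also exposes this as an `enable-wal` command):

```python
# A minimal sketch of enabling WAL mode on a SQLite file using only the
# standard library. The file path is illustrative.
import os
import sqlite3
import tempfile

def enable_wal(path):
    """Switch a database file to WAL journal mode and report the resulting mode."""
    conn = sqlite3.connect(path)
    # journal_mode is persistent: WAL is recorded in the database file itself,
    # so later connections (e.g. from cron scripts) inherit it automatically.
    (mode,) = conn.execute("PRAGMA journal_mode=wal;").fetchone()
    conn.close()
    return mode

path = os.path.join(tempfile.mkdtemp(), "example.db")
print(enable_wal(path))  # → wal
```

WAL mode allows readers and writers to proceed concurrently, which is why it helps with "database is locked" errors when cron scripts write while Datasette reads.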
RECENT ENTRIES
WEEKNOTES: COVID-19, FIRST PYTHON NOTEBOOK, MORE DOGSHEEP, TAILSCALE
20 hours ago
My covid-19.datasettes.com project publishes information on COVID-19 cases around the world. The project started out using data from Johns Hopkins CSSE, but last week the New York Times started publishing high quality USA county- and state-level daily numbers to their own repository. Here's the change that added the NY Times data. It's very easy to use this data to accidentally build misleading things. I've been updating the README with links about this—my current favourite is Why It's So Freaking Hard To Make A Good COVID-19 Model by Maggie Koerth, Laura Bronner and Jasmine Mithani at FiveThirtyEight.
FIRST PYTHON NOTEBOOK
Ben Welsh from the LA Times teaches a course called First Python Notebook at journalism conferences such as NICAR. He ran a free online version of the course last weekend, and I offered to help out as a TA. Most of the help I provided came before the course: Ben asked attendees to confirm that they had working installations of Python 3 and pipenv, and if they didn't, volunteers such as myself would step in to help. I had Zoom and email conversations with at least ten people to help them get their environments into shape. This XKCD neatly summarizes the problem. One of the most common problems I had to debug was PATH issues: people had installed the software, but due to various environmental differences python3 and pipenv weren't available on the PATH. Talking people through the obscurities of creating a ~/.bashrc file and using it to define a PATH over-ride really helps emphasize how arcane this kind of knowledge is. I enjoyed this comment:

> "Welcome to intro to Tennis. In the first two weeks, we'll discuss how to rig a net and resurface a court."—Claus Wilke
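The PATH over-ride in question is typically a one-liner; a hedged sketch (the ~/.local/bin directory is an assumption — pip, Homebrew and the python.org installers all put binaries in different places):

```shell
# Sketch of the ~/.bashrc PATH over-ride described above. The directory is
# an assumption; adjust it to wherever python3/pipenv actually landed.
export PATH="$HOME/.local/bin:$PATH"
# After appending that line to ~/.bashrc, reload and verify:
#   source ~/.bashrc
#   command -v python3
```

The order matters: prepending means the user-installed interpreter wins over any older system one.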
Ben’s course itself is hands down the best introduction to Python from a Data Journalism perspective I have ever seen. Within an hour of starting, the students are using Pandas in a Jupyter notebook to find interesting discrepancies in California campaign finance data. If you want to check it out yourself, the entire four hour workshop is now on YouTube and closely follows the material on firstpythonnotebook.org.
CORONAVIRUS DIARY
We are clearly living through a notable and very painful period of history right now. On the 19th of March (just under two weeks ago, but time is moving both really fast and incredibly slowly right now) I started a personal diary—something I’ve never done before. It lives in an Apple Note and I’m adding around a dozen paragraphs to it every day. I think it’s helping. I’m sure it will be interesting to look back on in a few years’ time.

DOGSHEEP
Much of my development work this past week has gone into my Dogsheep suite of tools for personal analytics.

* I upgraded the entire family of tools for compatibility with sqlite-utils 2.x.
* pocket-to-sqlite got a major upgrade: it now fetches items using Pocket’s API pagination (previously it just tried to pull in 5,000 items in one go) and has the ability to only fetch new items. As a result I’m now running it from cron in my personal Dogsheep instance, so “Save to Pocket” is now my preferred Dogsheep-compatible way of bookmarking content.
* twitter-to-sqlite got a couple of important new features in release 0.20. I fixed a nasty bug in the --since flag where retweets from other accounts could cause new tweets from an account to be ignored. I also added a new count_history table which automatically tracks changes to a Twitter user’s friends, follower and listed counts over time (#40).
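The pagination change can be illustrated with a generic offset-based loop. This is a sketch: the function and parameter names are illustrative, not pocket-to-sqlite's actual internals.

```python
# Hypothetical sketch of offset-based API pagination, in the spirit of the
# pocket-to-sqlite fix: request successive pages instead of one huge batch.

def fetch_all(fetch_page, page_size=30):
    """Yield items by requesting successive offsets until a page comes back short."""
    offset = 0
    while True:
        page = fetch_page(offset=offset, count=page_size)
        yield from page
        if len(page) < page_size:
            break  # a short page means we've reached the end
        offset += page_size

# Usage with a stand-in for the real API call:
fake_items = list(range(73))

def fake_page(offset, count):
    return fake_items[offset : offset + count]

assert list(fetch_all(fake_page, page_size=30)) == fake_items
```

"Only fetch new items" then falls out naturally: record the newest timestamp seen and pass it as a since-style parameter on the next cron run, stopping as soon as older items appear.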
I’m also now using Dogsheep for some journalism! I’m working with the Big Local News team at Stanford to help track and archive tweets by a number of different US politicians and health departments relating to the ongoing pandemic. This collaboration resulted in the above improvements to twitter-to-sqlite.

TAILSCALE
My personal Dogsheep is currently protected by client certificates, so only my personal laptop and iPhone (with the right certificates installed) can connect to the web server it is running on. I spent a bit of time this week playing with Tailscale, and I’m _really_ impressed by it. Tailscale is a commercial company built on top of WireGuard, the new approach to VPN tunnels which just got merged into the Linux 5.6 kernel. Tailscale first caught my attention in January when they hired Brad Fitzpatrick.
WireGuard lets you form a private network by having individual hosts exchange public/private keys with each other. Tailscale provides software which manages those keys for you, making it trivial to set up a private network between different nodes. How trivial? It took me less than ten minutes to get a three-node private network running between my iPhone, laptop and a Linux server. I installed the iPhone app, the Ubuntu package and the OS X app, signed them all into my Google account and I was done. Each of those devices now has an additional IP address in the 100.x range which they can use to talk to each other. Tailscale guarantees that the IP address will stay constant for each of them. Since the network is public/private key encrypted between the nodes, Tailscale can’t see any of my traffic—they’re purely acting as a key management mechanism. And it’s free: Tailscale charge for networks with multiple users, but a personal network like this is free of charge.
I’m not running my own personal Dogsheep on it yet, but I’m tempted to switch over. I’d love other people to start running their own personal Dogsheep instances but I’m paranoid about encouraging this when securing them is so important. Tailscale looks like it might be a great solution for making secure personal infrastructure more easily and widely available.

8:29 pm / 1st April 2020 / tailscale, weeknotes, dogsheep, datajournalism, projects, datasette, python, bradfitzpatrick, teaching

WEEKNOTES: DATASETTE 0.39 AND MANY OTHER PROJECTS
This week’s theme: Well, I’m not going anywhere. So a ton of progress to report on various projects.

DATASETTE 0.39
This evening I shipped Datasette 0.39. The two big features are a mechanism for setting the default sort order for tables and a new base_url configuration setting. You can see the new default sort order in action on my Covid-19 project—the daily reports now default to sort by day descending so the most recent figures show up first. Here’s the metadata that makes it happen, and here’s the new documentation.
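For reference, a table-level default sort is declared in metadata roughly like this (a sketch: the covid database and daily_reports table names stand in for the real ones; sort_desc is the new key, with sort as its ascending counterpart):

```json
{
  "databases": {
    "covid": {
      "tables": {
        "daily_reports": {
          "sort_desc": "day"
        }
      }
    }
  }
}
```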
I had to do some extra work on that project this morning when the underlying data changed its CSV column headings without warning.
The base_url feature has been an open issue since January 2019. It lets you run Datasette behind a proxy on a different URL prefix—/tools/datasette/ for example. The trigger for finally getting this solved was a Twitter conversation about running
Datasette on Binder in coordination with a Jupyter notebook. Tony Hirst did some work on this last year, but was stumped by the lack of a base_url equivalent. Terry Jones shared an implementation in December. I finally found the inspiration to pull it all together, and ended up with a working fork of Tony’s project which does indeed launch Datasette on Binder—try launching your own here.
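In practice that looks something like this (a sketch: the port, prefix and proxy configuration are invented, and the --config syntax shown is the settings mechanism Datasette used at the time):

```shell
# Tell Datasette it is being served under a URL prefix
datasette serve data.db --config base_url:/tools/datasette/ --port 8001

# Then have the fronting proxy forward that prefix, e.g. in nginx:
#   location /tools/datasette/ {
#       proxy_pass http://127.0.0.1:8001/tools/datasette/;
#   }
```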
GITHUB-TO-SQLITE
I’ve not done much work on my Dogsheep family of tools in a while. That changed this week: in particular, I shipped a 1.0 of github-to-sqlite. As you might expect, it’s a tool for importing GitHub data into a SQLite database. Today it can handle repositories, releases, release assets, commits, issues and issue comments. You can see a live demo built from Dogsheep organization data at github-to-sqlite.dogsheep.net (deployed by this GitHub action).
I built this tool primarily to help me better keep track of all of my projects. Pulling the issues into a single database means I can run queries against all open issues across all of my repositories, and importing commits and releases is handy for when I want to write my weeknotes and need to figure out what I’ve worked on lately.

DATASETTE-RENDER-MARKDOWN

GitHub issues use Markdown. To correctly display them it’s useful to be able to render that Markdown. I built datasette-render-markdown back in November, but this week I made some substantial upgrades: you can now configure which columns should be rendered, and it includes support for Markdown extensions including GitHub-Flavored Markdown. You can see it in action on the github-to-sqlite demo.
I also upgraded datasette-render-timestamps with the same explicit column configuration pattern.

DATASETTE-PUBLISH-FLY

Fly is a relatively new hosting provider which lets you host applications bundled as Docker containers in load-balanced data centers geographically close to your users. It has a couple of characteristics that make it a really good fit for Datasette.
Firstly, the pricing model: Fly will currently host a tiny (128MB of RAM) container for $2.67/month—and they give you $10/month of free service credit, enough for 3 containers. It turns out Datasette runs just fine in 128MB of RAM, so that’s three always-on Datasette containers! (Unlike Heroku and Cloud Run, Fly keeps your containers running rather than scaling them to zero.) Secondly, it works by shipping it a Dockerfile. This means building datasette publish support for it is really easy.
I added the publish_subcommand plugin hook to Datasette all the way back in 0.25 in September 2018, but I’ve never actually built anything with it. That’s now changed: datasette-publish-fly uses the hook to add a datasette publish fly command for publishing databases directly to your Fly account.
HACKER-NEWS-TO-SQLITE

It turns out I created my Hacker News account in 2007, and I’ve posted 2,167 comments and submitted 131 stories since then. Since my personal Dogsheep project is about pulling my data from multiple sources into a single place it made sense to build a tool for importing from Hacker News. hacker-news-to-sqlite uses the official Hacker News API to import every comment and story posted by a specific user. It can also use one or more item IDs to suck in the entire discussion tree around those items. The README includes detailed documentation on how to best browse your data using Datasette once you have imported it.

OTHER PROJECTS
* sqlite-utils gained some improvements to the way it suggests types for existing columns.
* twitter-to-sqlite now offers --sql and --attach for more of its subcommands.
* datasette-show-errors is a new plugin which exposes 500 errors as tracebacks, like Django does with DEBUG=True. It’s built on top of Starlette’s ServerErrorMiddleware.
* I upgraded inaturalist-to-sqlite to work with sqlite-utils 2.x.
5:33 am / 25th March 2020 / github, dogsheep, weeknotes, projects, datasette, markdown, sqlite, jupyter

WEEKNOTES: THIS WEEK WAS ABSURD
As of this morning, San Francisco is in a legally mandated shelter-in-place. I can hardly remember what life was like seven days ago. It’s been a very long, very crazy week. This was not a great week for getting stuff done.
So very short weeknotes today.

* I started work on datasette-edit-tables, a plugin that will eventually allow Datasette users to modify tables—add columns, change the types of existing columns, rename tables and suchlike. So far all it offers is a delete table button.
* I released sqlite-utils 2.4.2 (and 2.4.1 before it) with a couple of bug fixes. Notably it does a better job of detecting the types of different existing SQLite columns—it turns out SQLite supports a wide range of cosmetic column types.
* I did a bit of work on my github-contents library, which aims to make it as easy as possible to write code that updates text stored in a GitHub repository. I want to be able to use it to programmatically create pull requests (so I can add a visual editor to my cryptozoology crowdsourcing project). I added a branch_exists() method and I’m working on being able to commit to a branch other than master.
Natalie and I are hunkering down for the long run here in San Francisco, attempting to stay mentally healthy through aggressive use of Zoom and Google Hangouts. Best of luck to everyone out there getting through this thing.

2:52 am / 18th March 2020 / weeknotes, projects

WEEKNOTES: COVID-19 NUMBERS IN DATASETTE

COVID-19, the disease caused by the novel coronavirus, gets more terrifying every day. Johns Hopkins Center for Systems Science and Engineering (CSSE) have been collating data about the spread of the disease and publishing it as CSV files on GitHub.
This morning I used the pattern described in Deploying a data API using GitHub Actions and Cloud Run to set up a scheduled task that grabs their data once an hour and publishes it to https://covid-19.datasettes.com/ as a table in Datasette.
If you’re not yet concerned about COVID-19 you clearly haven’t been paying attention to what’s been happening in Italy. Here’s a query which shows a graph of the number of confirmed cases in Italy over the past few weeks (using datasette-vega):

155 cases 17 days ago to 10,149 cases today is really frightening. And the USA still doesn’t have robust testing in place, so the numbers here are likely to really shock people once they start to become more apparent.
If you’re going to use the data in covid-19.datasettes.com for anything please be responsible with it and read the warnings in the README file in detail: it’s important to fully understand the sources of the data and how it is being processed before you use it to make any assertions about the spread of COVID-19. My favourite resource to understand Coronavirus and what we should be doing about it is flattenthecurve.com, compiled by Julie McMurry, an assistant professor at Oregon State University College of Public Health. I strongly recommend checking it out.
OTHER PROJECTS
I’ve worked on a bunch of other projects this week, some of which were inspired by my time at NICAR.
* fec-to-sqlite is a script for saving FEC campaign finance filings to a SQLite database. Since those filings are pulled in via HTTP and can get pretty big, it uses a neat trick to generate a progress bar with the tqdm library—it initiates a progress bar with the Content-Length of the incoming file, then as it iterates over the lines coming in over HTTP it uses the length of each line to update that bar.
* datasette-search-all is a new plugin that enables search across multiple FTS-enabled SQLite tables at once. I wrote more about that in this blog post on Monday.
* datasette-column-inspect is an extremely experimental plugin that tries out a “column inspector” tool for Datasette tables—click on a column heading and the plugin shows you interesting facts about that column, such as the min/mean/max/stdev, any outlying values, the most common values and the least common values. Screenshot below. This prototype came about as part of a JSK team project for the Designing Machine Learning course at Stanford—we were thinking about ways in which machine learning could help journalists find stories in large datasets. The prototype doesn’t have any machine learning in it—just some simple statistics to identify outliers—but it’s meant to illustrate how a tool that exposes machine learning insights against tabular data might work.
* github-to-sqlite grew a new sub-command: github-to-sqlite commits github.db simonw/datasette—which imports information about commits to a repository (just the author and commit message, not the body of the commit itself). I’m running a private version of this against all of my projects, which is really useful for seeing what I worked on over the past week when writing my weeknotes.

Here are two screenshots of datasette-column-inspect in action. You can try out a live demo of the plugin over here.
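The Content-Length trick fec-to-sqlite uses for its progress bar (described in the first bullet above) boils down to a few lines. A minimal sketch without the tqdm dependency—here the report callback collects percentages into a list, where the real tool advances a tqdm bar initialized with the Content-Length:

```python
def lines_with_progress(lines, content_length, report):
    """Yield lines from an HTTP response body, reporting progress.

    content_length is the value of the Content-Length header; progress
    advances by the byte length of each line as it arrives.
    """
    seen = 0
    for line in lines:
        seen += len(line)
        report(f"{seen / content_length:.0%}")
        yield line

# Simulated response body; in the real tool these lines stream over HTTP.
body = [b"HDR|a|b\n", b"ROW|1|2\n", b"ROW|3|4\n", b"ROW|5|6\n"]
total = sum(len(line) for line in body)
reports = []
processed = list(lines_with_progress(body, total, reports.append))
```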
4:49 am / 11th March 2020 / weeknotes, projects, plugins, coronavirus, datasette

DATASETTE-SEARCH-ALL: A NEW PLUGIN FOR SEARCHING MULTIPLE DATASETTE TABLES AT ONCE

I just released a new plugin for Datasette, and it’s pretty fun. datasette-search-all is a plugin written mostly in JavaScript that executes the same search query against every searchable table in every database connected to your Datasette instance. You can try it out on my FARA (Foreign Agents Registration Act) search site, fara.datasettes.com—see Deploying a data API using GitHub Actions and Cloud Run for background on that project. Here’s a search for manafort across all four FARA tables (derived from CSVs originally pulled from the Justice Department bulk data site).
I’ve been planning to build cross-table search for Datasette for quite a while now. It’s a potentially very difficult problem: searching a single table is easy, but the moment you attempt to search multiple tables you run into a number of challenges:

* Different tables have different columns. How do you present those in a single search interface?
* Full-text search relevance scores make sense within a single table (due to the statistics they rely on, see Exploring search relevance algorithms with SQLite) but cannot be compared across multiple tables.

I have an idea for how I can address these, but it involves creating a single separate full-text index table that incorporates text from many different tables, along with a complex set of indexing mechanisms (maybe driven by triggers) for keeping it up to date.

BUT MAYBE I WAS OVERTHINKING THIS?

While I stewed on the ideal way to solve this problem, projects like my FARA site were stuck without cross-table search. Then this morning I realized that there was another way: I could build pretty much the simplest thing that could possibly work (always a good plan in my experience).
Here’s how the new plugin works: it scans through every table attached to Datasette looking for tables that are configured for full-text search. Then it presents a UI which can execute searches against ALL of those tables, and present the top five results from each one. The scanning-for-searchable-tables happens in Python, but the actual searching is all in client-side JavaScript. The searches run in parallel, which means the user sees results from the fastest (smallest) tables first, then the larger, slower tables drop in at the bottom. It’s stupidly simple, but I really like the result. It’s also a neat demonstration of running parallel SQL queries from JavaScript, a technique which I’m keen to apply to all sorts of other interesting problems.
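The plugin does its fan-out in client-side JavaScript; here is the same pattern sketched in Python’s asyncio, with invented table names and sleep delays standing in for tables of different sizes:

```python
import asyncio

async def search_table(table, delay, query):
    # Stand-in for fetch()ing /database/table.json?_search=<query>;
    # the delay simulates how long each table takes to search.
    await asyncio.sleep(delay)
    return table

async def search_all(query):
    # Fire every per-table search at once and collect results as each
    # one finishes, so small fast tables appear before big slow ones.
    tasks = [
        asyncio.create_task(search_table("commits", 0.05, query)),
        asyncio.create_task(search_table("tweets", 0.15, query)),
        asyncio.create_task(search_table("issues", 0.25, query)),
    ]
    finished = []
    for task in asyncio.as_completed(tasks):
        finished.append(await task)
    return finished

order = asyncio.run(search_all("manafort"))
```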
JAVASCRIPT STYLE
The JavaScript I wrote for this project is unconventional for 2020: it’s a block of inline script on the page, using no libraries or frameworks, but taking advantage of modern niceties like backtick template literals and fetch(). The code is messy, short and extremely easy to change in the future. It doesn’t require running a build tool. I’m pretty happy with it.

ADDING A SEARCH FORM TO THE HOMEPAGE

The other thing the plugin does is add a search box to the Datasette homepage (as seen on the FARA site)—but only if the attached databases contain at least one FTS-configured searchable table. There are two parts to the implementation here. The first is an extra_template_vars() plugin hook which injects a searchable_tables variable into the homepage context—code here.
The second is a custom index.html template which ships with the plugin.
When Datasette renders a template it looks first in the local --template-dir folder (if that option was used), secondly in all of the installed plugins and finally in the Datasette set of default templates. The new index.html template starts with {% extends "default:index.html" %}, which means it extends the default template that shipped with Datasette. It then redefines the description_source_license block from that template to conditionally show the search form. I’m not at all happy with abusing description_source_license in this way—it just happens to be a block located at the top of that page. As I write more plugins that customize the Datasette UI in some way I continually run into this problem: plugins need to add markup to pages in specific points, but they also need to do so in a way that won’t override what other plugins are up to. I’m beginning to formulate an idea for how Datasette can better support this, but until that’s ready I’ll be stuck with hacks like the one used here.
USING THIS WITH DATASETTE-CONFIGURE-FTS AND DATASETTE-UPLOAD-CSVS

The datasette-configure-fts plugin provides a simple UI for configuring search for different tables, by selecting which columns should be searchable. Combining this with datasette-search-all is really powerful. It means you can dump a bunch of CSVs into Datasette (maybe using datasette-upload-csvs), select some columns and then run searches across all of those different data sources in one place. Not bad for 93 lines of JavaScript and a bit of Python glue!

12:59 am / 9th March 2020 / fulltextsearch, search, projects, plugins, datasette, javascript

WEEKNOTES: DATASETTE-ICS, DATASETTE-UPLOAD-CSVS, DATASETTE-CONFIGURE-FTS, ASGI-CSRF
I’ve been preparing for the NICAR 2020 Data Journalism conference this week, which has led me into a flurry of activity across a plethora of different projects and plugins.

DATASETTE-ICS
NICAR publish their schedule as a CSV file. I couldn’t resist loading it into a Datasette on Glitch, which inspired me to put together a plugin I’ve been wanting for ages: datasette-ics, a register_output_renderer() plugin that can produce a subscribable iCalendar file from an arbitrary SQL query. It’s based on datasette-atom and works in a similar way: you construct a query that outputs a required set of columns (event_name and event_dtstart as a minimum), then add the .ics extension to get back an iCalendar file. You can optionally also include event_dtend, event_duration, event_description, event_uid and most importantly event_tz, which can contain a timezone string. Figuring out how to handle timezones was the fiddliest part of the project.
If you’re going to NICAR, subscribe to https://nicar-2020.glitch.me/data/calendar.ics in a calendar application to get the full 261 item schedule. If you just want to see what the iCalendar feed looks like, add ?_plain=1 to preview it with a text/plain content type: https://nicar-2020.glitch.me/data/calendar.ics?_plain=1—and here’s the SQL query that powers it.
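The column contract is simple enough to sketch: alias whatever your schema calls things to the names the renderer expects. Here the sessions table and its columns are invented; the event_* aliases are the plugin’s:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE sessions (title TEXT, starts_at TEXT, room TEXT);
    INSERT INTO sessions VALUES
        ('First Python Notebook', '2020-03-05 09:00:00', 'Salon A');
""")

# Alias columns to the names datasette-ics looks for; append .ics to the
# query URL and each returned row becomes a calendar event.
row = conn.execute("""
    SELECT
        title AS event_name,
        starts_at AS event_dtstart,
        room AS event_description,
        'America/New_York' AS event_tz
    FROM sessions
""").fetchone()
```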
DATASETTE-UPLOAD-CSVS

My work on Datasette Cloud is inspiring all kinds of interesting work on plugins. I released datasette-upload-csvs a while ago, but now that Datasette has official write support I’ve been upgrading the plugin to hopefully achieve its full potential. In particular, I’ve been improving its usability. CSV files can be big—and if you’re uploading 100MB of CSV it’s not particularly reassuring if your browser just sits for a few minutes spinning on the status bar.
So I added two progress bars to the plugin. The first is a client-side progress bar that shows you the progress of the initial file upload. I used the XMLHttpRequest pattern (and the drag-and-drop recipe) from Joseph Zimmerman’s useful article How To Make A Drag-and-Drop File Uploader With Vanilla JavaScript—fetch() doesn’t reliably report upload progress just yet. I’m using Starlette and asyncio so uploading large files doesn’t tie up server resources in the same way that it would if I was using processes and threads. The second progress bar relates to server-side processing of the file: churning through 100,000 rows of CSV data and inserting them into SQLite can take a while, and I wanted users to be able to see what was going on.
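A minimal sketch of that server-side bookkeeping (the _csv_progress_ table and the every-10-rows cadence match the description below; the function, schema and column names are invented for illustration):

```python
import csv
import io
import sqlite3

def import_csv_with_progress(db, fileobj, total_bytes, table="data", every=10):
    # The front-end polls this table over the JSON API to draw the bar.
    db.execute(
        "CREATE TABLE IF NOT EXISTS _csv_progress_ (bytes_done INTEGER, total INTEGER)"
    )
    reader = csv.reader(io.TextIOWrapper(fileobj, encoding="utf-8"))
    headers = next(reader)
    db.execute("CREATE TABLE {} ({})".format(table, ", ".join(headers)))
    count = 0
    for row in reader:
        count += 1
        db.execute(
            "INSERT INTO {} VALUES ({})".format(table, ", ".join("?" * len(row))),
            row,
        )
        if count % every == 0:
            # fileobj.tell() reports how many bytes have been consumed so
            # far, measured against total_bytes (the file's full size).
            db.execute(
                "INSERT INTO _csv_progress_ VALUES (?, ?)",
                (fileobj.tell(), total_bytes),
            )
    db.execute("INSERT INTO _csv_progress_ VALUES (?, ?)", (total_bytes, total_bytes))
    return count

raw = b"a,b\n" + b"".join("{},{}\n".format(n, n * 2).encode() for n in range(25))
db = sqlite3.connect(":memory:")
imported = import_csv_with_progress(db, io.BytesIO(raw), len(raw))
```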
Here’s an animated screenshot of how the interface looks now:

Implementing this was trickier. In the end I took advantage of the new dedicated write thread made available by datasette.execute_write_fn()—since that thread has exclusive access to write to the database, I create a SQLite table called _csv_progress_ and write a new record to it every 10 rows. I use the number of bytes in the CSV file as the total and track how far through that file Python’s CSV parser has got using file.tell(). It seems to work really well. The full server-side code is here—the progress bar itself then polls Datasette’s JSON API for the record in the _csv_progress_ table.

DATASETTE-CONFIGURE-FTS

SQLite ships with a decent implementation of full-text search. Datasette knows how to tell if a table has been configured for full-text search and adds a search box to the table page, documented here.
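That detection builds on SQLite’s FTS4/FTS5 virtual tables. Roughly what a search-configured table looks like under the hood, sketched in raw SQL with invented names (real tooling such as sqlite-utils also creates matching UPDATE and DELETE triggers):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT);

    -- External-content FTS5 index over the searchable columns
    CREATE VIRTUAL TABLE documents_fts USING fts5(
        title, body, content='documents', content_rowid='id'
    );

    -- A trigger keeps the index in sync as rows are inserted
    CREATE TRIGGER documents_ai AFTER INSERT ON documents BEGIN
        INSERT INTO documents_fts (rowid, title, body)
        VALUES (new.id, new.title, new.body);
    END;
""")
db.execute(
    "INSERT INTO documents (title, body) VALUES (?, ?)",
    ("Squirrel Census", "Data about squirrels in Central Park"),
)
hits = db.execute(
    "SELECT title FROM documents_fts WHERE documents_fts MATCH ?", ("squirrels",)
).fetchall()
```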
datasette-configure-fts is a new plugin that provides an interface for configuring search against existing SQLite tables. Under the hood it uses the sqlite-utils full-text search methods to configure the table and set up triggers to keep the index updated as data in the table changes. It’s pretty simple, but it means that users of Datasette Cloud can upload a potentially enormous CSV file and then click to set specific columns as searchable. It’s a fun example of the kind of things that can be built with Datasette’s new write capabilities.

ASGI-CSRF
CSRF is one of my favourite web application security vulnerabilities—I first wrote about it on this blog back in 2005! I was surprised to see that the Starlette/ASGI ecosystem doesn’t yet have much in the way of CSRF prevention. The best option I could find was to use the WTForms library with Starlette.
I don’t need a full forms library for my purposes (at least not yet) but I needed CSRF protection for datasette-configure-fts, so I’ve started working on a small ASGI middleware library called asgi-csrf. It’s modelled on a subset of Django’s robust CSRF prevention. The README warns people NOT to trust it yet—there are still some OWASP recommendations that it needs to apply (issue here) and I’m not yet ready to declare it robust and secure. It’s a start though, and feels like exactly the kind of problem that ASGI middleware is meant to address.
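The core idea—a double-submit-cookie check—fits in a small piece of ASGI middleware. This is an illustration of the pattern, not asgi-csrf’s actual code (which also reads tokens from form POSTs and handles many more cases); the X-CSRFToken header name and other details are invented:

```python
import asyncio

class CSRFMiddleware:
    """Reject unsafe-method requests unless a submitted token (here an
    X-CSRFToken header) matches the csrftoken cookie."""

    SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and scope["method"] not in self.SAFE_METHODS:
            headers = dict(scope.get("headers") or [])
            cookies = dict(
                part.strip().split("=", 1)
                for part in headers.get(b"cookie", b"").decode().split(";")
                if "=" in part
            )
            token = headers.get(b"x-csrftoken", b"").decode()
            if not cookies.get("csrftoken") or token != cookies["csrftoken"]:
                await send({
                    "type": "http.response.start", "status": 403,
                    "headers": [(b"content-type", b"text/plain")],
                })
                await send({"type": "http.response.body", "body": b"CSRF check failed"})
                return
        await self.app(scope, receive, send)

async def inner_app(scope, receive, send):
    # Trivial ASGI app that always responds 200 OK
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})

app = CSRFMiddleware(inner_app)
```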
2:27 am / 4th March 2020 / weeknotes, datajournalism, projects, datasette, security, ical, asgi, search, plugins, datasettecloud, csrf

ELSEWHERE
TODAY
* Several grumpy opinions about remote work at Tailscale. Really useful in-depth reviews of the tools Tailscale are using to build their remote company. “We decided early on—about the time we realized all three cofounders live in different cities—that we were going to go all-in on remote work, at least for engineering, which for now is almost all our work. As several people have pointed out before, fully remote is generally more stable than partly remote.”

30TH MARCH 2020
* gifcap. This is really neat: a purely client-side implementation of animated GIF screen capture, using navigator.mediaDevices.getDisplayMedia for the screen capturing, mithril for the UI and the gif.js pure JavaScript GIF encoding library to render the output.

27TH MARCH 2020
* PostGraphile: Production Considerations. PostGraphile is a tool for building a GraphQL API on top of an existing PostgreSQL schema. Their “production considerations” documentation is particularly interesting because it directly addresses some of my biggest worries about GraphQL: the potential for someone to craft an expensive query that ties up server resources. PostGraphile suggests a number of techniques for avoiding this, including a statement timeout, a query whitelist, pagination caps and (in their “pro” version) a cost limit that uses a calculated cost score for the query.

26TH MARCH 2020
* Making Datasets Fly with Datasette and Fly. It’s always exciting to see a Datasette tutorial that wasn’t written by me! This one is great—it shows how to load Central Park Squirrel Census data into a SQLite database, explore it with Datasette and then publish it to the Fly hosting platform using datasette-publish-fly and datasette-cluster-map.

* > Slack’s not specifically a “work from home” tool; it’s more of a “create organizational agility” tool. But an all-at-once transition to remote work creates a lot of demand for organizational agility.
  — Stewart Butterfield

21ST MARCH 2020
* hacker-news-to-sqlite. The latest in my Dogsheep series of tools: hacker-news-to-sqlite uses the Hacker News API to fetch your comments and submissions from Hacker News and save them to a SQLite database.

19TH MARCH 2020
* Django: Added support for asynchronous views and middleware. An enormously consequential feature just landed in Django, and is set to ship as part of Django 3.1 in August. Asynchronous views will allow Django applications to define views using “async def myview(request)”—taking full advantage of Python’s growing asyncio ecosystem and providing enormous performance improvements for Django sites that do things like hitting APIs over HTTP. Andrew has been puzzling over this for ages and it’s really exciting to see it land in a form that should be usable in a stable Django release in just a few months.

* datasette-publish-fly. Fly is a neat new Docker hosting provider with a very tempting pricing model: just $2.67/month for their smallest always-on instance, and they give each user $10/month in free credit. datasette-publish-fly is the first plugin I’ve written using the publish_subcommand plugin hook, which allows extra hosting providers to be added as publish targets. Install the plugin and you can run “datasette publish fly data.db” to deploy SQLite databases to your Fly account.

12TH MARCH 2020
* New governance model for the Django project. This has been under discussion for a long time: I’m really excited to see it put into action. It’s difficult to summarize, but the key effect should be a much more vibrant, active set of people involved in making decisions about the framework.

* Announcing Daylight Map Distribution. Mike Migurski announces a new distribution of OpenStreetMap: a 42GB dump of the version of the data used by Facebook, carefully moderated to minimize the chance of incorrect or maliciously offensive edits. Lots of constructive conversation in the comments about the best way for Facebook to make their moderation decisions more available to the OSM community.

9TH MARCH 2020
* The unexpected Google wide domain check bypass. Fantastic story of discovering a devious security vulnerability in a bunch of Google products stemming from a single exploitable regular expression in the Google Closure JavaScript library.

7TH MARCH 2020
* > I called it normalization because then President Nixon was talking a lot about normalizing relations with China. I figured that if he could normalize relations, so could I.
  — Edgar F. Codd

5TH MARCH 2020
* Millions of tiny databases. Fascinating, detailed review of a paper that describes Amazon’s Physalia, a distributed configuration store designed to provide extremely high availability coordination for Elastic Block Store replication. My eyebrows raised at “Physalia is designed to offer consistency and high-availability, even under network partitions,” since that’s such a blatant violation of the CAP theorem, but it later justifies it like so: “One desirable property therefore, is that in the event of a partition, a client’s Physalia database will be on the same side of the partition as the client. Clever placement of cells across nodes can maximise the chances of this.”